NAVAL RESEARCH LOGISTICS QUARTERLY

SEPTEMBER 1973, VOL. 20, NO. 3
OFFICE OF NAVAL RESEARCH
NAVSO P-1278

EDITORS

F. D. Rigby, Texas Tech University
B. J. McDonald, Office of Naval Research
O. Morgenstern, New York University
S. M. Selig, Managing Editor, Office of Naval Research, Arlington, Va. 22217

ASSOCIATE EDITORS

R. Bellman, RAND Corporation
J. C. Busby, Jr., Captain, SC, USN (Retired)
W. W. Cooper, Carnegie Mellon University
J. G. Dean, Captain, SC, USN
G. Dyer, Vice Admiral, USN (Retired)
P. L. Folsom, Captain, USN (Retired)
M. A. Geisler, RAND Corporation
A. J. Hoffman, International Business Machines Corporation
H. P. Jones, Commander, SC, USN (Retired)
S. Karlin, Stanford University
H. W. Kuhn, Princeton University
J. Laderman, Office of Naval Research
R. J. Lundegard, Office of Naval Research
W. H. Marlow, The George Washington University
R. E. McShane, Vice Admiral, USN (Retired)
W. F. Millson, Captain, SC, USN
H. D. Moore, Captain, SC, USN (Retired)
M. I. Rosenberg, Captain, USN (Retired)
D. Rosenblatt, National Bureau of Standards
J. V. Rosapepe, Commander, SC, USN (Retired)
T. L. Saaty, University of Pennsylvania
E. K. Scofield, Captain, SC, USN (Retired)
M. W. Shelly, University of Kansas
J. R. Simpson, Office of Naval Research
J. S. Skoczylas, Colonel, USMC
S. R. Smith, Naval Research Laboratory
H. Solomon, The George Washington University
I. Stakgold, Northwestern University
E. D. Stanley, Jr., Rear Admiral, USN (Retired)
C. Stein, Jr., Captain, SC, USN (Retired)
R. M. Thrall, Rice University
T. C. Varley, Office of Naval Research
J. F. Tynan, Commander, SC, USN (Retired)
J. D.
Wilkes, Department of Defense OASD (ISA)

The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations. Information for Contributors is indicated on the inside back cover.

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June, September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402. Subscription price: $10.00 a year in the U.S. and Canada, $12.50 elsewhere. The cost of individual issues may be obtained from the Superintendent of Documents.

The views and opinions expressed in this quarterly are those of the authors and not necessarily those of the Office of Naval Research.

Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations, NAVEXOS P-35. Permission has been granted to use the copyrighted material appearing in this publication.

SEQUENTIAL DETERMINATION OF INSPECTION EPOCHS FOR RELIABILITY SYSTEMS WITH GENERAL LIFETIME DISTRIBUTIONS*

S. Zacks and W. J. Fenske
Department of Mathematics and Statistics
Case Western Reserve University
Cleveland, Ohio

ABSTRACT

The problem of determining the optimal inspection epoch is studied for reliability systems in which N components operate in parallel. The lifetime distribution is arbitrary but known. The optimization is carried out with respect to two cost factors: the cost of inspecting a component and the cost of failure. The inspection epochs are determined so that the expected cost of the whole system per time unit per cycle is minimized. In the general case the optimization process depends on the whole failure history of the system.
This dependence is characterized. The cases of Weibull lifetime distributions are elaborated and illustrated numerically, and the characteristics of the optimal inspection intervals are studied theoretically.

1. INTRODUCTION

In the present study we investigate the problem of determining the optimal inspection epochs of a reliability system comprised of N components, operating independently (in parallel) and having the same lifetime distribution. The lifetime distribution is known. An inspector visits the system at a predetermined inspection epoch and finds a certain number of components which have failed. The exact times of failure are unknown. All the components which have failed during the interval between inspections are replaced by new components. Components which have not failed are left in the system. We consider two types of cost factors: (i) the cost of inspection, which depends on the number of components in the system; and (ii) the cost of failure per unit time; this cost component measures the loss due to the failure of components.

The objective is to determine an inspection policy that is optimal with respect to the criterion of minimizing the total expected (discounted) cost over the entire future. However, since we are dealing with general lifetime distributions (not necessarily exponential), the dynamic programming solution is excessively complicated, even in the truncated case (when the number of inspections may not exceed a prescribed bound). We therefore consider in the present paper a sequential myopic procedure. Accordingly, after each inspection the epoch of the next inspection is determined as a function of the whole past failure history of the system. The aim is to minimize the conditional expected cost per time unit from the present time until the next inspection epoch.
In the case of exponential lifetime distributions (constant failure rates) the optimal inspection interval (the time interval between inspections) does not depend on the past history of the system. As shown in the present study, if the lifetime distribution is not exponential this dependence may be very strong, especially if N is not large and the lifetime distribution has a decreasing failure rate (DFR). The dependence of the optimal inspection intervals on the observed number of failures, and on the number of components that were replaced at previous inspections and are still operating, will be explicitly characterized.

We start in section 2 by formulating the model and the associated distributions. In section 3 we develop a general formula for the sequential determination of the optimal length of the inspection intervals. In section 4 we derive the corresponding formulas for lifetime distributions of the Weibull family and illustrate the process with a numerical example. In section 5 we try to explain the complex behavior illustrated in the example of section 4 by further theoretical development.

There are numerous papers in the reliability literature on inspection epochs and optimal maintenance. For the general theory see chapter 4 of Barlow and Proschan [1]. Articles close to the present study are those of Kamins [4], Kander [5], Kander and Naor [6], and Kander and Rabinovitch [7]. The present study provides further elaboration of a chapter in the thesis of Fenske [3]. The main difference between the present study and the articles mentioned above is in the basic model: the present study is concerned with multicomponent systems, while the other studies treat the whole system as one component. The study of Ehrenfeld [2] was based on a model similar to ours, but Ehrenfeld considered the problem of determining the inspection interval for the estimation of the mean time between failures in the exponential case.

2.
THE MODEL AND ASSOCIATED DISTRIBUTIONS

Consider a reliability system which consists of N, N \ge 1, components. These components operate independently (in parallel). Let T designate the lifetime of a component. This is a random variable having a known distribution function (c.d.f.) F(t). We assume that F(t) is absolutely continuous, with a positive density function f(t), and F(0) = 0. We further assume that the expected value of T under F(t) is finite.

Let S_0 = 0 and let S_0 < S_1 < S_2 < \cdots < S_m < \cdots designate a sequence of inspection epochs. Let J_m (m = 1, 2, \ldots) designate the number of components that failed during the time interval (S_{m-1}, S_m). All J_m such components are replaced at the inspection epoch S_m. The N - J_m components which have not failed during (S_{m-1}, S_m) are classified into m disjoint subsets A_0^{(m)}, A_1^{(m)}, \ldots, A_{m-1}^{(m)}. The subset A_j^{(m)} (j = 0, \ldots, m-1) contains all the components that were replaced at epoch S_j and did not fail throughout the time interval (S_j, S_m). Let n_j^{(m)} designate the number of elements of A_j^{(m)}. Obviously, A_j^{(m+1)} \subseteq A_j^{(m)} and n_j^{(m+1)} \le n_j^{(m)} for each j = 0, 1, \ldots and m = j, j+1, \ldots. Let n_m^{(m)} = J_m, and n^{(m)} = (n_0^{(m)}, n_1^{(m)}, \ldots, n_m^{(m)}) for each m = 0, 1, \ldots; n_0^{(0)} = N.

If a component belongs to the subset A_j^{(m)}, then its conditional lifetime distribution at time t is:

(2.1)  F_j^{(m)}(t) = P\{T \le t - S_j \mid T > S_m - S_j\} = \begin{cases} 0, & t \le S_m, \\[4pt] \dfrac{F(t - S_j) - F(S_m - S_j)}{1 - F(S_m - S_j)}, & t > S_m. \end{cases}

In particular, F_m^{(m)}(t) = F(t - S_m) if t > S_m, and zero otherwise. The conditional densities of U = T - (S_m - S_j), corresponding to the lifetime T of a component which belongs to A_j^{(m)}, play an important role in our procedure. We call U the remaining lifetime. If a component is chosen at random at time t = S_m+, its remaining lifetime U has the conditional density function

(2.2)  h_m(u \mid S^{(m)}, n^{(m)}) = \frac{1}{N} \sum_{j=0}^{m} n_j^{(m)} \, \frac{f(u + S_m - S_j)}{1 - F(S_m - S_j)}, \qquad u \ge 0,

where S^{(m)} = (S_0, \ldots, S_m).
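As a concrete illustration (a minimal sketch, not part of the original paper), the mixture density (2.2) can be evaluated directly from the state (S^{(m)}, n^{(m)}). The epochs and counts below are hypothetical; exponential lifetimes are used, for which the mixture collapses to f(u) regardless of the history, the memorylessness property noted in the next paragraph.

```python
import math

def remaining_life_density(u, S, n, f, F):
    """Evaluate (2.2): the density of the remaining life U of a component
    drawn at random just after the m-th inspection, given the inspection
    epochs S = (S_0, ..., S_m) and the counts n = (n_0^(m), ..., n_m^(m))."""
    N = sum(n)
    Sm = S[-1]
    return sum(nj * f(u + Sm - Sj) / (1.0 - F(Sm - Sj))
               for nj, Sj in zip(n, S)) / N

# Exponential lifetimes with mean 100 hours (rate 0.01): every term of the
# mixture reduces to f(u), so the history drops out.
lam = 0.01
f = lambda t: lam * math.exp(-lam * t)
F = lambda t: 1.0 - math.exp(-lam * t)
S = [0.0, 82.5, 165.0]   # hypothetical inspection epochs
n = [60, 25, 15]         # hypothetical state with N = 100 components
print(remaining_life_density(30.0, S, n, f, F))  # equals f(30.0)
```

For a non-exponential lifetime the same function exhibits the history dependence studied in sections 4 and 5.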
We notice that if T has a negative exponential distribution, i.e., f(t) = \lambda e^{-\lambda t}, t \ge 0, for some 0 < \lambda < \infty, then h_m(u \mid S^{(m)}, n^{(m)}) = f(u) for all m = 1, 2, \ldots and all (S^{(m)}, n^{(m)}). This is a well-known property of the negative exponential distributions. Let H_m(u \mid S^{(m)}, n^{(m)}) designate the c.d.f. corresponding to (2.2).

3. SEQUENTIAL DETERMINATION OF INSPECTION EPOCHS

We consider in the present section the problem of deriving an inspection policy which attains a certain economic objective. We assume that the cost of inspecting the system is $C_0 per inspection and, on the other hand, that if an element fails then the cost associated with its failure is $C_f per time unit. The inspection policy adopted here is the following: given the history of the past m inspection intervals, i.e., (S^{(m)}, n^{(m)}), determine the (m+1)st inspection epoch so that the average expected cost per time unit of inspection and of failure over the (m+1)st inspection interval is minimized. We remark in this connection that this policy is in essence a myopic policy, which minimizes the expected time-average cost for each inspection interval individually. A dynamic programming determination of the inspection epochs could attain a more global optimization. However, attempts at dynamic programming solutions lead to complicated sets of recursive functional equations, whose solution is generally very tedious. As will be shown later, the suggested myopic procedure is globally optimal if the lifetime distribution is exponential. In other cases of interest, like the Weibull distributions, the myopic procedure does not coincide with the global dynamic programming solution. A study of the relative efficiency of the myopic procedure is still under way.

Let \Delta designate the length of the (m+1)st inspection interval; that is, \Delta = S_{m+1} - S_m.
Given (S^{(m)}, n^{(m)}), the conditional expected average cost per time unit, under \Delta, is

(3.1)  R_m(\Delta; S^{(m)}, n^{(m)}) = \frac{C_0}{\Delta} + \frac{C_f}{\Delta} \sum_{j=0}^{m} n_j^{(m)} \int_0^{\Delta} (\Delta - u)\, \frac{f(u + S_m - S_j)}{1 - F(S_m - S_j)}\, du.

In terms of the conditional distribution of the remaining lifetime U we can express (3.1) in the form

(3.2)  R_m(\Delta; S^{(m)}, n^{(m)}) = \frac{C_0}{\Delta} + N C_f\, H_m(\Delta \mid S^{(m)}, n^{(m)}) - \frac{N C_f}{\Delta} \int_0^{\Delta} u\, h_m(u \mid S^{(m)}, n^{(m)})\, du.

The optimal (m+1)st inspection epoch is defined as S_{m+1} = S_m + \Delta^0, where \Delta^0 is a positive real value \Delta at which the infimum of (3.2) is attained. Let

(3.3)  \mu_m = \int_0^{\infty} u\, h_m(u \mid S^{(m)}, n^{(m)})\, du

be the expected remaining life, given (S^{(m)}, n^{(m)}). According to the assumptions of the previous section, \mu_m < \infty. Differentiating R_m(\Delta; S^{(m)}, n^{(m)}) with respect to \Delta, we obtain that if \mu_m \le C_0/(N C_f) then \Delta^0 = \infty. This is a case in which no more inspections are warranted. On the other hand, if \mu_m > C_0/(N C_f), there exists a unique solution \Delta^0 to the equation

(3.4)  \int_0^{\Delta} u\, h_m(u \mid S^{(m)}, n^{(m)})\, du = C_0/(N C_f).

We see from (3.4) that S_{m+1} is a function of the statistic (S^{(m)}, n^{(m)}) of the system. As we have already mentioned, in cases of exponential lifetime distributions the optimal length of the inspection intervals is the same for all m = 1, 2, \ldots. If \theta = \lambda^{-1} is the mean time between failures (MTBF) in the exponential case, then \mu_m = \theta for all m, and the condition for a finite \Delta^0 is that C_0 < N C_f \theta; i.e., the cost of inspection is smaller than the expected cost of failure. If this condition is satisfied then, letting \gamma = C_0/(N C_f \theta), it is easy to show that

(3.5)  \Delta^0 = \frac{\theta}{2}\, \chi^2_{\gamma}[4],

where \chi^2_{\gamma}[4] designates the \gamma-fractile of a chi-square distribution with 4 degrees of freedom.

4. OPTIMAL INSPECTION EPOCHS FOR WEIBULL DISTRIBUTIONS

Suppose that the lifetime of an element, T, follows a Weibull distribution with density function

(4.1)  f(t; \theta, \alpha) = \begin{cases} 0, & t \le 0, \\[4pt] \dfrac{\alpha}{\theta}\, t^{\alpha - 1} \exp\{-t^{\alpha}/\theta\}, & t > 0, \end{cases}

where \alpha and \theta are positive real parameters.
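The fractile formula (3.5) is easy to check numerically: the chi-square distribution with 4 degrees of freedom has the closed-form c.d.f. 1 - e^{-c/2}(1 + c/2), so its \gamma-fractile can be found by bisection. The sketch below is an illustration (not from the paper), using the parameter values of the numerical example of section 4 (\theta = 100 hours, \gamma = 0.2).

```python
import math

def chi2_4_cdf(c):
    # Closed-form c.d.f. of the chi-square distribution with 4 degrees of freedom.
    return 1.0 - math.exp(-c / 2.0) * (1.0 + c / 2.0)

def chi2_4_fractile(gamma, lo=0.0, hi=100.0):
    # gamma-fractile by bisection on the monotone c.d.f.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_4_cdf(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta, gamma = 100.0, 0.2          # values from the numerical example of section 4
delta0 = 0.5 * theta * chi2_4_fractile(gamma)   # equation (3.5)
print(delta0)   # about 82.45 hours; the text, rounding the fractile to 1.65, gives 82.5
```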
We notice that if 0 < \alpha < 1 the distribution has a decreasing failure rate (DFR), and if 1 < \alpha < \infty its failure rate is increasing (IFR). When \alpha = 1 the distribution is exponential. Given (S^{(m)}, n^{(m)}), the density function of the remaining life U assumes the special form

(4.2)  h_m^{W}(u \mid S^{(m)}, n^{(m)}) = \frac{1}{N} \sum_{j=0}^{m} n_j^{(m)} \exp\{(S_m - S_j)^{\alpha}/\theta\} \cdot \frac{\alpha}{\theta}\, (u + S_m - S_j)^{\alpha - 1} \exp\{-(u + S_m - S_j)^{\alpha}/\theta\},

for 0 \le u < \infty. When m = 0, (4.2) reduces to (4.1). Following the procedure given in the previous section, we see that S_1 < \infty if, and only if,

(4.3)  C_0/(N C_f) < \theta^{1/\alpha}\, \Gamma\!\left(\frac{1}{\alpha} + 1\right),

where \theta^{1/\alpha} \Gamma(1/\alpha + 1) is the expected lifetime. If (4.3) is satisfied, then the optimal value of S_1 is

(4.4)  S_1 = \left[\theta\, G^{-1}\!\left(\gamma;\, 1,\, \frac{1}{\alpha} + 1\right)\right]^{1/\alpha},

where G^{-1}(\gamma; p, \nu) is the \gamma-fractile of the Gamma distribution G(p, \nu) with scale parameter p, and where \gamma = C_0 \big/ \left(N C_f\, \theta^{1/\alpha}\, \Gamma(1/\alpha + 1)\right). We notice that if 2/\alpha is a positive integer, then

(4.5)  S_1 = \left\{\frac{\theta}{2}\, \chi^2_{\gamma}\!\left[\frac{2}{\alpha} + 2\right]\right\}^{1/\alpha}.

We now determine a general expression for the left-hand side of (3.4). According to (4.2),

(4.6)  \int_0^{\Delta} u\, h_m^{W}(u \mid S^{(m)}, n^{(m)})\, du = \frac{\alpha}{N\theta} \sum_{j=0}^{m} n_j^{(m)} \exp\{(S_m - S_j)^{\alpha}/\theta\} \int_0^{\Delta} u\, (u + S_m - S_j)^{\alpha - 1} \exp\{-(u + S_m - S_j)^{\alpha}/\theta\}\, du.

By a proper change of variable we obtain

(4.7)  \frac{\alpha}{\theta} \int_0^{\Delta} u\, (u + S_m - S_j)^{\alpha - 1} \exp\{-(u + S_m - S_j)^{\alpha}/\theta\}\, du
       = \int_{(S_m - S_j)^{\alpha}/\theta}^{(S_m - S_j + \Delta)^{\alpha}/\theta} \left[(\theta w)^{1/\alpha} - (S_m - S_j)\right] e^{-w}\, dw
       = \theta^{1/\alpha}\, \Gamma\!\left(\frac{1}{\alpha} + 1\right) \left[G\!\left(\frac{(S_m - S_j + \Delta)^{\alpha}}{\theta};\, 1,\, \frac{1}{\alpha} + 1\right) - G\!\left(\frac{(S_m - S_j)^{\alpha}}{\theta};\, 1,\, \frac{1}{\alpha} + 1\right)\right]
         - (S_m - S_j)\left[\exp\!\left\{-\frac{(S_m - S_j)^{\alpha}}{\theta}\right\} - \exp\!\left\{-\frac{(S_m - S_j + \Delta)^{\alpha}}{\theta}\right\}\right].

Substituting (4.7) into (4.6), we obtain that S_{m+1} = S_m + \Delta^0, where \Delta^0 is the root of the equation

(4.8)  \sum_{j=0}^{m} \frac{n_j^{(m)}}{N} \exp\{(S_m - S_j)^{\alpha}/\theta\} \Biggl\{ \theta^{1/\alpha}\, \Gamma\!\left(\frac{1}{\alpha} + 1\right) \left[G\!\left(\frac{(S_m - S_j + \Delta)^{\alpha}}{\theta};\, 1,\, \frac{1}{\alpha} + 1\right) - G\!\left(\frac{(S_m - S_j)^{\alpha}}{\theta};\, 1,\, \frac{1}{\alpha} + 1\right)\right]
         - (S_m - S_j) \exp\{-(S_m - S_j)^{\alpha}/\theta\} \left[1 - \exp\left\{-\frac{1}{\theta}\left[(S_m - S_j + \Delta)^{\alpha} - (S_m - S_j)^{\alpha}\right]\right\}\right] \Biggr\} = \gamma\, \theta^{1/\alpha}\, \Gamma\!\left(\frac{1}{\alpha} + 1\right),

where \gamma is as before and G(x; p, \nu) is the c.d.f. of G(p, \nu) at x. We notice that for m = 0 the solution of (4.8) reduces to the one given by (4.4). In Figure 1 we illustrate the solution of (4.8) for three Weibull distributions, where the n_j^{(m)} sequences were generated by Monte Carlo simulation.
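The root of (4.8) must in general be found numerically. As a sketch (not the Newton-Raphson computation used in the paper), the case m = 0 can be handled by solving (3.4) directly with trapezoid quadrature and bisection; the parameters \alpha = 5/4, \theta = 100, C_0/(N C_f) = 20 are those of the IFR case of the numerical example that follows.

```python
import math

def weibull_pdf(t, theta=100.0, alpha=1.25):
    # Density (4.1): (alpha/theta) t^(alpha-1) exp(-t^alpha/theta).
    return (alpha / theta) * t ** (alpha - 1.0) * math.exp(-(t ** alpha) / theta)

def lhs(delta, npts=4000):
    # Left-hand side of (3.4) for m = 0: integral of u f(u) over (0, delta),
    # by the composite trapezoid rule.
    h = delta / npts
    return sum((0.5 if i in (0, npts) else 1.0) * (i * h) * weibull_pdf(i * h)
               for i in range(npts + 1)) * h

target = 200.0 / 10.0     # C0/(N*Cf) with C0 = $200N and Cf = $10 per hour
lo, hi = 1e-6, 500.0
for _ in range(100):      # bisection: lhs is increasing in delta
    mid = 0.5 * (lo + hi)
    if lhs(mid) < target:
        lo = mid
    else:
        hi = mid
delta1 = 0.5 * (lo + hi)
print(round(delta1, 1))   # close to the 58.0 hours of Table 1, Case I
```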
The cases under consideration have the following parameters: C_f = $10, C_0 = $200 \cdot N, \theta = 100 [hr], and \alpha = 3/4, 1, and 5/4. The case \alpha = 1 corresponds to the exponential distribution with mean \theta = 100. According to (3.5), the optimal inspection interval for \alpha = 1 is of length 50\, \chi^2_{\gamma}[4] [hr], where \gamma = C_0/(N \theta C_f) = 0.2. One finds in standard statistical tables that \chi^2_{0.2}[4] = 1.65. Hence, the optimal interval between inspections in the exponential case is of length 82.5 hours.

The case \alpha = 5/4 represents an IFR distribution. We see in Figure 1 that the optimal inspection intervals vary very little around a length of 59 hours. It is interesting to notice that in the present case of an IFR distribution the optimal inspection intervals do not depend strongly on the number of components, N, in the system. This is not the case when the Weibull distribution is DFR (\alpha = 3/4). As illustrated in Figure 1, the optimal intervals for DFR distributions, as obtained from (4.8), are sensitive to N. When N = 10 there are considerable fluctuations of the solution of (4.8); when N = 100 these fluctuations diminish. The general trend of growth in the length of the inspection intervals is, however, the same. An explanation of this phenomenon will be provided in the next section.

[Figure 1. Optimum inspection intervals for Weibull distributions with C_f = $10, C_0 = $200N, and \theta = 100 (hr); optimal interval length plotted against inspection number (5-50) for \alpha = 3/4, \alpha = 1 (exponential, N arbitrary), and \alpha = 5/4.]

Finally, we remark that the numerical solution of Equation (4.8) in the case discussed here was attained by Newton-Raphson iterative corrections to an initial solution. For further details see Fenske [3].

5. FURTHER THEORETICAL CHARACTERIZATION OF THE SOLUTION

A characterization of the solution obtained from (3.4) is not a simple matter, since the inspection epochs S_2, S_3, \ldots
are random variables depending on the random vectors n^{(m)} in quite a complicated manner. We remark that the sequences \{n_j^{(m)} : m = j, j+1, \ldots\} are supermartingales for each j = 0, 1, 2, \ldots, and \lim_{m \to \infty} n_j^{(m)} exists, as shown. Furthermore, if S_m - S_{m-1} \ge \Delta for every m, then \lim_{m \to \infty} n_j^{(m)} = 0 for each j. This property holds for any lifetime distribution F. In order to obtain certain theoretical approximations to the distributions of the roots of (3.4), we consider a modified problem in which, for each m = 0, 1, \ldots, the random variables (n_j^{(m)}, j = 0, \ldots, m) are replaced by some fixed nonnegative values. More specifically, consider the distribution of the random variable

(5.1)  W_m = \frac{1}{N} \sum_{j=0}^{m} n_j^{(m)} \left[1 - F(S_m - S_j)\right]^{-1} \int_0^{\Delta} u\, f(u + S_m - S_j)\, du,

in which the inspection epochs are predetermined fixed values. W_m is the left-hand side of Equation (3.4). Whenever S_1, S_2, \ldots are fixed inspection epochs, the vector (n_0^{(m)}, n_1^{(m)}, \ldots, n_m^{(m)}) has for each m = 1, 2, \ldots a multinomial distribution with parameters N and (\theta_j^{(m)}; j = 0, \ldots, m), where \theta_j^{(m)} is the probability that an element belongs to A_j^{(m)}, \sum_{j=0}^{m} \theta_j^{(m)} = 1. The probabilities \theta_j^{(m)} can be determined recursively according to the following formulae:

(5.2)  \theta_0^{(m)} = 1 - F(S_m), \qquad \theta_j^{(m)} = \left[1 - \sum_{i=0}^{j-1} \theta_i^{(j)}\right]\left[1 - F(S_m - S_j)\right], \quad j = 1, \ldots, m.

It follows that for any fixed sequence of inspection epochs and for each m = 0, 1, \ldots,

(5.3)  E\{n_j^{(m)}\} = N \theta_j^{(m)}, \qquad \mathrm{Var}\{n_j^{(m)}\} = N \theta_j^{(m)} \left(1 - \theta_j^{(m)}\right), \qquad j = 0, 1, \ldots, m,

and

(5.4)  \mathrm{cov}(n_j^{(m)}, n_k^{(m)}) = -N \theta_j^{(m)} \theta_k^{(m)}, \qquad 0 \le j < k \le m.

From (5.2) and (5.3) we conclude that if the length of each inspection interval is not smaller than \Delta then, for any distribution F, \lim_{m \to \infty} n_j^{(m)} = 0 for each j.

The variable W_m is a linear combination of multinomial random variables. Its expectation is

(5.5)  \omega_m = E\{W_m\} = D_0^{(m)} + \sum_{j=1}^{m} \left[1 - \sum_{i=0}^{j-1} \theta_i^{(j)}\right] D_j^{(m)},

where

(5.6)  D_j^{(m)} = \int_0^{\Delta} u\, f(u + S_m - S_j)\, du.

The variance of W_m is
(5.7)  \mathrm{Var}\{W_m\} = \frac{1}{N} \left\{ \sum_{j=0}^{m} \theta_j^{(m)} \left[\frac{D_j^{(m)}}{1 - F(S_m - S_j)}\right]^2 - \left[\sum_{j=0}^{m} \frac{\theta_j^{(m)} D_j^{(m)}}{1 - F(S_m - S_j)}\right]^2 \right\}.

We have shown that for any sequence of inspection epochs, \mathrm{Var}\{W_m\} = O(N^{-1}) as N \to \infty. This explains why the fluctuations of the roots of (4.8) are relatively large when N = 10 and small when N = 100.

We consider now a particular sequence of inspection epochs which consists of values of S_m obtained by the repeated solution (for each m) of the equation \omega_m = \gamma, where \gamma = C_0/(N C_f); i.e.,

(5.8)  \int_0^{\Delta} u\, f(u + S_m)\, du + \sum_{j=1}^{m} \left[1 - \sum_{i=0}^{j-1} \theta_i^{(j)}\right] \int_0^{\Delta} u\, f(u + S_m - S_j)\, du = \gamma.

S_1 is the root \Delta of \int_0^{\Delta} u f(u)\, du = \gamma, and for each m = 1, 2, \ldots the (m+1)st inspection epoch is given by S_{m+1} = S_m + \Delta. The sequence of fixed inspection epochs determined by this procedure corresponds to the expected values of n^{(m)}, and we therefore label this procedure the Procedure of Averages. In Table 1 we provide the inspection intervals determined by the Procedure of Averages, and the corresponding multinomial probabilities \theta_j^{(m)} (j = 0, \ldots, m), for two of the cases represented in Figure 1. The graph of the corresponding inspection intervals for the case \alpha = 3/4 (DFR) is also plotted in Figure 1.

As demonstrated in Table 1, in the IFR case (\alpha = 5/4) the significant contribution to the solution is expected to be that of n_m^{(m)} and n_{m-1}^{(m)}, or of their corresponding expected values. Furthermore, the optimal length of the inspection intervals varies very little with the number of inspections, m, and in the present example it reaches a stable situation after two inspections. This is not the case, however, for the DFR distribution (\alpha = 3/4). The probabilities \theta_j^{(m)} approach zero, as m grows, very slowly. This is reflected in a steady increase in the length of the inspection intervals as m grows; a stable situation is reached in the present example only after 10 inspections.
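The Procedure of Averages can be sketched in a few lines (an illustration, not the original computations): carry the recursion (5.2) through the coefficients b_j = 1 - \sum_{i<j} \theta_i^{(j)} appearing in (5.5), and solve \omega_m = \gamma, i.e. equation (5.8), by bisection at each stage. The parameters below are those of Case I of Table 1 (\alpha = 5/4, \theta = 100, \gamma = C_0/(N C_f) = 20).

```python
import math

ALPHA, THETA, GAMMA = 1.25, 100.0, 20.0   # Case I of Table 1; GAMMA = C0/(N*Cf)

def F(t):
    # Weibull c.d.f. matching (4.1).
    return 1.0 - math.exp(-(t ** ALPHA) / THETA) if t > 0 else 0.0

def f(t):
    # Weibull density (4.1).
    return (ALPHA / THETA) * t ** (ALPHA - 1.0) * math.exp(-(t ** ALPHA) / THETA)

def D(delta, shift, npts=2000):
    # D_j^(m) of (5.6): trapezoid-rule integral of u f(u + shift) over (0, delta).
    h = delta / npts
    return sum((0.5 if i in (0, npts) else 1.0) * (i * h) * f(i * h + shift)
               for i in range(npts + 1)) * h

def next_interval(S, b):
    # Solve omega_m(Delta) = GAMMA, equation (5.8), by bisection.
    Sm = S[-1]
    omega = lambda d: sum(bj * D(d, Sm - Sj) for bj, Sj in zip(b, S))
    lo, hi = 1e-6, 1000.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if omega(mid) < GAMMA:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# b_j = 1 - sum_i theta_i^(j), built from the recursion (5.2).
S, b = [0.0], [1.0]
for m in range(3):
    delta = next_interval(S, b)
    S.append(S[-1] + delta)
    b.append(1.0 - sum(bi * (1.0 - F(S[-1] - Si)) for bi, Si in zip(b, S[:-1])))
    print(m + 1, round(delta, 1))   # Table 1, Case I gives 58.0, 59.3, 59.1
```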
Table 1. Values of the optimal inspection intervals \Delta [hr] and the multinomial probabilities \theta_j^{(m)} under the Procedure of Averages, for Weibull distributions with \theta = 100 [hr] and cost components C_0 = $200N, C_f = $10.

Case I: \alpha = 5/4 (IFR)

 m   opt. \Delta   j=0     j=1     j=2     j=3     j=4     j=5
 1   58.0         0.2017  0.7983
 2   59.3         0.0211  0.1543  0.8246
 3   59.1         0.0016  0.0161  0.1600  0.8223
 4   59.1         0.0010  0.0013  0.0167  0.1595  0.8225
 5   59.1         0.0001  0.0013  0.0166  0.1595  0.8225
 6   59.1         0.0001  0.0013  0.0166  0.1595  0.8225

Case II: \alpha = 3/4 (DFR)

 m   opt. \Delta   j=0     j=1     j=2     j=3     j=4     j=5     j=6     j=7     j=8     j=9     j=10
 1   147.6        0.6548  0.3452
 2   156.8        0.4825  0.2216  0.2959
 3   161.8        0.3666  0.1624  0.1880  0.2830
 4   164.8        0.2839  0.1231  0.1372  0.1787  0.2771
 5   166.6        0.2229  0.0953  0.1039  0.1302  0.1742  0.2735
 6   167.9        0.1769  0.0748  0.0803  0.0984  0.1267  0.1716  0.2713
 7   168.8        0.1416  0.0594  0.0630  0.0761  0.0957  0.1246  0.1698  0.2698
 8   169.4        0.1142  0.0475  0.0500  0.0600  0.0739  0.0941  0.1233  0.1686  0.2686
 9   169.9        0.0927  0.0383  0.0401  0.0473  0.0580  0.0726  0.0930  0.1223  0.1678  0.2678
10   170.3        0.0756  0.0311  0.0323  0.0379  0.0460  0.0569  0.0718  0.0924  0.1216  0.1671  0.2673

To ensure that the inspection intervals discussed in sections 3 and 4 have properties similar to those determined by procedures of fixed inspection epochs, we could consider the following adjustment. First, determine for each m = 1, 2, \ldots two fixed sequences of inspection epochs which constitute upper and lower (confidence) limits for the solution of (3.4) (or (4.8)). This can be done by utilizing formulae (5.5) and (5.7). The lower confidence limits can be obtained by repeated solution (for the root \Delta) of the equation

(5.9)  \omega_m + 3\left[\mathrm{Var}\{W_m\}\right]^{1/2} = \gamma, \qquad m = 1, 2, \ldots.

The upper limits can be obtained by solving the equation

(5.10)  \omega_m - 3\left[\mathrm{Var}\{W_m\}\right]^{1/2} = \gamma, \qquad m = 1, 2, \ldots.

In the second phase of the computation, solve Equation (3.4).
If the solution lies between the roots of (5.9) and (5.10), proceed; otherwise truncate the solution to either the lower limit or the upper limit, whichever is closer to the actual solution. Such an adjustment guarantees that every inspection interval is bounded by lower and upper values determined by fixed sequences of inspection epochs, and therefore has the general characteristics established here.

REFERENCES

[1] Barlow, R. E. and F. Proschan, Mathematical Theory of Reliability (John Wiley and Sons, New York, 1967).
[2] Ehrenfeld, S., "Some Experimental Design Problems in Attribute Life Testing," J. Am. Stat. Assoc. 57, 668-679 (1962).
[3] Fenske, W. J., "Optimal Inspection Epochs for Reliability Studies," Ph.D. Dissertation, Department of Mathematics and Statistics, Case Western Reserve University (1972).
[4] Kamins, M., "Determining Checkout Intervals for Systems Subject to Random Failures," The Rand Corporation, Paper RM-2578 (1960).
[5] Kander, Z., "Inspection Policies of Deteriorating Equipment Characterized by N Quality Levels," Technion-Israel Institute of Technology, Operations Research Monograph No. 93 (1971).
[6] Kander, Z. and P. Naor, "Optimization of Inspection Policies by Dynamic Programming," Technion-Israel Institute of Technology, Operations Research Monograph No. 61 (1970).
[7] Kander, Z. and A. Rabinovitch, "Maintenance Policies When Failure Distribution of Equipment is Only Partially Known," Technion-Israel Institute of Technology, Operations Research Monograph No. 92 (1972).

AN EMPIRICAL BAYES ESTIMATOR FOR THE SCALE PARAMETER OF THE TWO-PARAMETER WEIBULL DISTRIBUTION

G. Kemble Bennett
Virginia Polytechnic Institute and State University

and

H. F. Martz
Texas Tech University

ABSTRACT

An empirical Bayes estimator is given for the scale parameter in the two-parameter Weibull distribution.
The scale parameter is assumed to vary randomly throughout a sequence of experiments according to a common, but unknown, prior distribution. The shape parameter is assumed to be known; however, it may be different in each experiment. The estimator is obtained by means of a continuous approximation to the unknown prior density function. Results from Monte Carlo simulation are reported which show that the estimator has smaller mean-squared error than the usual maximum-likelihood estimator.

INTRODUCTION

A large number of authors have considered estimation of the parameters of the Weibull distribution by the method of maximum likelihood, the method of moments, and numerous other classical techniques. Frequently, however, the parameters of the Weibull distribution are subject to random variation, and an analysis which encompasses this feature is better suited. Such an analysis has been performed by Soland for the cases where the scale parameter is treated as a random variable [5] and where both the shape and scale parameters are treated as random variables [6]. These approaches exhibit a Bayesian viewpoint, as adequately described by Raiffa and Schlaifer [4]: emphasis is placed on determining conjugate prior distributions and on performing both terminal and preposterior analysis.

In this paper we obtain empirical Bayes estimates for the scale parameter. This approach, like the Bayesian approach, allows for the assumption of a randomly varying scale parameter. The analysis, however, does not require any specific assumption as to the distributional form of this parameter. Since this distribution generally remains unknown, the empirical Bayes approach can, in a large majority of cases, be successfully applied.
Application would certainly be warranted, for example, in reliability life testing situations where the lifetime distribution of items subjected to routine testing is adequately described by a Weibull distribution but where the scale parameter varies from test to test.

THE EMPIRICAL BAYES APPROACH

Consider the situation in which we observe a value t (which may be vector valued) from a Weibull density function given by

(1)  f(t \mid \lambda) = \lambda \beta t^{\beta - 1} e^{-\lambda t^{\beta}}, \qquad t > 0,

and must estimate the parameter \lambda with small squared error. The shape parameter, \beta, is assumed to be known; however, the scale parameter, \lambda, which determines t, is itself assumed to be a realization of an unobservable random variable. Furthermore, it is assumed that this estimation process is a routinely recurring situation. Therefore, as the process is repeated we obtain a sequence of realizations of independent and identically distributed random variables t_1, t_2, \ldots, t_n. Our problem is to determine an estimator \hat{\lambda}_n = \hat{\lambda}_n(t_1, \ldots, t_n) which minimizes E(\hat{\lambda}_n - \lambda_n)^2, where the expectation is taken with respect to all the random variables involved and where \lambda_n is the nth or current realization of \lambda. Since \lambda is itself a random variable, this minimizing estimator is well known to be the Bayes estimator, the mean of the posterior distribution. This estimator can be represented by

(2)  E(\lambda \mid t) = \frac{\int \lambda\, f(t \mid \lambda)\, g(\lambda)\, d\lambda}{\int f(t \mid \lambda)\, g(\lambda)\, d\lambda},

where g(\lambda) is the true prior density function of \lambda. Since the prior density usually remains unknown in practice, the estimator E(\lambda \mid t) cannot be exactly determined. It can, however, be approximated using the information t_1, t_2, \ldots, t_n obtained from previous realizations of \lambda. Such an estimator is commonly referred to as an empirical Bayes estimator. For a detailed discussion of the empirical Bayes approach the reader is referred to, for example, Maritz [2].
To illustrate this situation, consider a repetitive testing program in which the time-to-failure density of tested items is given by (1). During each test a sample of k failure times is observed from (1) and an estimate of \lambda is to be given. For example, at the first test a sample of k failure times is recorded and an estimate of \lambda_1 is required. At the second test an additional random sample of k failure times is obtained and an estimate of \lambda_2, which may be different from \lambda_1, is required. This situation continues until the present or nth test is completed, at which time an estimate of \lambda_n is to be given. Due to changing environmental conditions from test to test, imperfect testing equipment, interactions of population components, etc., the values \lambda_1, \lambda_2, \ldots, \lambda_n are not likely to be equal, but to vary unpredictably and thus randomly. This variation can therefore be described by a prior density function. However, since the values of \lambda remain unknown, specification of g(\lambda) can often be risky. In the situation described here the observed experimental data t_1, t_2, \ldots, t_n are used to approximate g(\lambda), thereby relieving the experimenter of the task of specifying the form of g(\lambda).

DEVELOPMENT OF THE ESTIMATOR

Suppose that at each experiment j comprising a testing program a random sample from (1) is taken and a maximum-likelihood estimate

(3)  \hat{\lambda}_k = k \Big/ \sum_{i=1}^{k} t_i^{\beta}

is formed. Then the sequence of estimates

(4)  \hat{\lambda}_{k,1}, \hat{\lambda}_{k,2}, \ldots, \hat{\lambda}_{k,n}

provides a source of information on the past behavior of \lambda. Based on this sequence, a linear transformation can be performed which yields a new sequence of values

(5)  \lambda_1^*, \lambda_2^*, \ldots, \lambda_n^*,

which, when considered collectively, have a mean and variance approximating those of the random parameter \lambda. This sequence can now be used to approximate the prior density function g(\lambda).
Proper substitution of the approximation into (2) will yield an empirical Bayes estimate for \lambda_n, the nth or present realization of \lambda. The particular density approximation chosen is described by Parzen [3]. He presents a consistent density estimator of the form

(6)  \hat{g}_n(\lambda) = \frac{1}{n\, h(n)} \sum_{j=1}^{n} W\!\left(\frac{\lambda - \lambda_j}{h(n)}\right),

where W(\cdot) is a weighting function satisfying certain boundedness and regularity conditions, and h(n) is a smoothing constant so chosen that

\lim_{n \to \infty} h(n) = 0 \quad \text{and} \quad \lim_{n \to \infty} n\, h(n) = \infty.

These restrictions are placed on h(n) to assure the consistency of \hat{g}_n(\lambda). Parzen also lists several possible representations for W(\cdot). Using these results, Bennett and Martz [1] suggest replacing each unobservable \lambda_j in (6) by its corresponding transformed estimate \lambda_j^* to form the density approximation

(7)  \hat{g}_n(\lambda) = \frac{1}{n\, h(n)} \sum_{j=1}^{n} W\!\left(\frac{\lambda - \lambda_j^*}{h(n)}\right).

Subsequent substitution of (7) into (2) yields the empirical Bayes estimator

(8)  E_n(\lambda \mid t) = \frac{\int \lambda\, f(t \mid \lambda)\, \hat{g}_n(\lambda)\, d\lambda}{\int f(t \mid \lambda)\, \hat{g}_n(\lambda)\, d\lambda}.

The maximum-likelihood estimator \hat{\lambda}_k given by (3) is well known to be both consistent and sufficient for estimating \lambda, and can easily be shown to have the conditional density function

(9)  f(\hat{\lambda}_k \mid \lambda) = \left[\frac{k\lambda}{\hat{\lambda}_k}\right]^{k+1} \frac{\exp\left[-k\lambda/\hat{\lambda}_k\right]}{\Gamma(k)\, k\lambda},

with mean and variance given by

(10)  E(\hat{\lambda}_k \mid \lambda) = \frac{k\lambda}{k-1} \quad \text{and} \quad \mathrm{Var}(\hat{\lambda}_k \mid \lambda) = \frac{(k\lambda)^2}{(k-1)^2 (k-2)},

respectively. Since the maximum-likelihood estimator is sufficient for estimating \lambda, the Bayes estimator E(\lambda \mid t) as defined by (2) can be conveniently written as E(\lambda \mid \hat{\lambda}_k), and the corresponding empirical Bayes estimator becomes

(11)  E_n(\lambda \mid \hat{\lambda}_k) = \frac{\int \lambda\, f(\hat{\lambda}_k \mid \lambda)\, \hat{g}_n(\lambda)\, d\lambda}{\int f(\hat{\lambda}_k \mid \lambda)\, \hat{g}_n(\lambda)\, d\lambda},

where f(\hat{\lambda}_k \mid \lambda) and \hat{g}_n(\lambda) are given by (9) and (7), respectively.

Since the actual range of \lambda remains unknown, the region of integration in (8) and (11) must be determined empirically. This can be satisfactorily resolved by taking the region of integration to be the observed range of the estimates upon which the prior density approximation is based. Thus, it is only necessary to order the estimates \lambda_1^*, \lambda_2^*, \ldots, \lambda_n^* successively to obtain the region of integration.
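The conditional mean in (10) can be checked by a small Monte Carlo experiment (an illustration, not from the paper): draw k Weibull failure times by inverting the survival function e^{-\lambda t^{\beta}}, form the estimate (3), and average over many repetitions; the values \lambda = 2, \beta = 1.5, k = 5 are arbitrary choices for the sketch.

```python
import math
import random

def lambda_hat(lam, beta, k, rng):
    # Maximum-likelihood estimate (3) from k simulated failure times; the
    # survival function exp(-lam * t^beta) inverts to t = (-ln U / lam)^(1/beta).
    ts = [(-math.log(rng.random()) / lam) ** (1.0 / beta) for _ in range(k)]
    return k / sum(t ** beta for t in ts)

rng = random.Random(1973)
lam, beta, k = 2.0, 1.5, 5
mean = sum(lambda_hat(lam, beta, k, rng) for _ in range(20000)) / 20000
print(mean, k * lam / (k - 1))   # the sample mean is close to k*lam/(k-1) = 2.5
```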
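A minimal sketch of the approximation (7), assuming a Gaussian weighting function (Parzen lists several admissible kernels; the Gaussian choice and the \lambda^* values here are illustrative only) and h(n) = n^{-1/5} as in the simulation section below:

```python
import math

def g_n(lmbda, lam_stars, h=None):
    # Parzen-type estimate (7) of the prior density, Gaussian kernel W.
    n = len(lam_stars)
    if h is None:
        h = n ** -0.2            # h(n) = n^(-1/5)
    W = lambda y: math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    return sum(W((lmbda - ls) / h) for ls in lam_stars) / (n * h)

# Hypothetical transformed estimates lambda_j*, for illustration only.
lam_stars = [1.8, 2.1, 2.4, 2.0, 2.6, 1.9, 2.3, 2.2]
# The estimate integrates to approximately one over a wide grid:
grid, step = [i * 0.01 for i in range(601)], 0.01
val = sum(g_n(x, lam_stars) for x in grid) * step
print(round(val, 2))   # approximately 1.0
```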
Alternatively, the positive half of the real line could be used. Let us now consider the linear transformation of λ_k defined by

(12) λ* = C^{1/2}[λ_k − E(λ_k)] + E(λ),

where

(13) C = Var(λ)/Var(λ_k).

The mean and variance of λ* are easily verified to be those of the random variable λ, i.e., E(λ*) = E(λ) and Var(λ*) = Var(λ). If the mean and variance of λ were known, the transformation could be applied to each of the maximum likelihood estimates of sequence (4), forming sequence (5). Since the mean and variance of λ generally remain unknown, estimates of these quantities are required. Using the relationships of conditional probability, we have from (10) that

(14) E(λ_k) = E[E(λ_k|λ)] = [k/(k−1)] E(λ)

and that

(15) Var(λ_k) = Var[E(λ_k|λ)] + E[Var(λ_k|λ)] = [k/(k−1)]² Var(λ) + [k²/((k−1)²(k−2))] E(λ²).

From (14) the prior mean can be consistently estimated by

(16) Ê(λ) = [(k−1)/k] λ̄ₙ,

where λ̄ₙ denotes the sample mean (1/n) Σ_{i=1}^n λ_{k,i}. If in (15) Var(λ_k) is replaced by the sample variance S²ₙ = Σ_{i=1}^n (λ_{k,i} − λ̄ₙ)²/n, E(λ²) is replaced by the relation E(λ²) = E²(λ) + Var(λ), and E(λ) is replaced by its estimate (16), then upon solving for Var(λ) we obtain

(17) V̂ar(λ) = [(k−2)S²ₙ − λ̄ₙ²] (k−1)/k²

as an estimate of Var(λ). Proper substitution of the above results into (12) yields

(18) λ* = Ĉ^{1/2}[λ_k − λ̄ₙ] + [(k−1)/k] λ̄ₙ,

where

(19) Ĉ = (k−1)[(k−2) − λ̄ₙ²/S²ₙ]/k².

Thus the transformation defined by (12) is completely determined and the empirical Bayes estimator given by (11) can be formed.

MONTE CARLO SIMULATION

To ascertain the usefulness of the empirical Bayes estimator Eₙ(λ|λ_k), Monte Carlo simulation was employed on a UNIVAC 1108 computer. The criterion of comparison chosen was mean-squared error, and the widely used maximum likelihood estimator was the measurement reference.
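The moment-matching transformation (16)-(19) is mechanical enough to state as code. This is a sketch under our own naming, with one added guard: the variance estimate (17) can turn negative in small samples, in which case Ĉ is clamped at zero.

```python
import numpy as np

def transform_estimates(mles, k):
    """Linear transformation (18)-(19): shift and scale the raw MLEs so that their
    sample mean and variance mimic the estimated prior mean and variance of lambda."""
    x = np.asarray(mles, dtype=float)
    lam_bar = x.mean()                                         # sample mean of the MLEs
    s2 = ((x - lam_bar) ** 2).mean()                           # S_n^2 (divisor n, as in the text)
    c_hat = (k - 1) * ((k - 2) - lam_bar ** 2 / s2) / k ** 2   # equation (19)
    c_hat = max(c_hat, 0.0)                                    # guard: (17) may go negative
    return np.sqrt(c_hat) * (x - lam_bar) + (k - 1) / k * lam_bar  # equation (18)
```

By construction, the transformed values have mean (k−1)/k times the sample mean, matching (16), and variance Ĉ·S²ₙ, matching (17).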
Therefore, the ratio

(20) R = (empirical Bayes mean-squared error) / (maximum likelihood mean-squared error)

was of interest. In the simulation, a value of λ was randomly generated from a prior density function selected from the Pearson family of distributions [7]. Then a random sample t₁, t₂, …, t_k of size k, corresponding to the realization λ, was obtained from (1). The maximum likelihood estimate λ_k was then computed from (3), and its squared deviation (λ − λ_k)² from the corresponding realization of λ was calculated. For the second experiment a new value of λ was generated and the process repeated, obtaining λ_k and its squared deviation. For this experiment E₂(λ|λ_k) and its squared deviation [λ − E₂(λ|λ_k)]² from the corresponding realization of λ were also calculated. This was repeated 20 times, and each time Eₙ(λ|λ_k) was calculated using the present λ_{k,n} as well as all previous maximum likelihood estimates. Five hundred repetitions of this run of 20 experiments were then made, and the averages of the squared deviations of λ_k and Eₙ(λ|λ_k) were formed as estimates of E(λ − λ_k)² and E[λ − Eₙ(λ|λ_k)]², respectively. The ratio R was then calculated from these estimated mean-squared errors. All numerical integrations were performed by means of the 11-point Gauss quadrature formula, and the weighting function W(·) in (7) was taken to be

W(Y) = (1/2π)[(sin Y)/Y]²,  where Y = (λ − λ*ⱼ)/2h(n) and h(n) = n^{−1/5}.

This procedure was repeated for all types of Pearson prior distributions, and the ratio R was observed to be significantly influenced by the prior distribution only through the ratio of the conditional variance of the maximum likelihood estimator to the prior variance of λ. This quantity can be represented by

(21) Z = k² E²(λ) / [(k−1)²(k−2) Var(λ)],

where E(λ), the prior mean of λ, has been substituted for λ.
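A stripped-down version of the simulation loop can be written as follows. It is illustrative only: a gamma prior stands in for the Pearson family, the kernel is Gaussian, the untransformed MLE sequence is used, and the constants (seed, 100 repetitions instead of 500) are ours, so the resulting ratio only roughly tracks the R of (20).

```python
import numpy as np

rng = np.random.default_rng(1)
k, n_exp, n_reps = 10, 20, 100
se_ml, se_eb = [], []

for _ in range(n_reps):
    past = []
    for j in range(n_exp):
        lam = rng.gamma(4.0, 0.5)                              # stand-in prior: mean 2, variance 1
        lam_k = k / rng.exponential(1.0 / lam, size=k).sum()   # MLE, equation (3)
        past.append(lam_k)
        if j == 0:
            continue                                           # need at least one past estimate
        h = len(past) ** -0.2                                  # h(n) = n^(-1/5)
        grid = np.linspace(0.5 * min(past), 1.5 * max(past), 200)
        g = sum(np.exp(-(((grid - p) / h) ** 2) / 2) for p in past)  # unnormalized kernel prior
        f = grid ** k * np.exp(-k * grid / lam_k)              # (9) as a function of lambda, up to constants
        eb = float(np.sum(grid * f * g) / np.sum(f * g))       # posterior mean on the grid
        se_ml.append((lam - lam_k) ** 2)
        se_eb.append((lam - eb) ** 2)

R_hat = np.mean(se_eb) / np.mean(se_ml)
```

Here Z ≈ 0.6 for these prior moments, so by Figure 1 one would expect R_hat below one for moderate n, though this simplified sketch makes no such guarantee.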
Since the only factors affecting the ratio R, apart from the number of experiences, are contained in (21), this quantity can conveniently be used to summarize and index a given situation. It was generally observed in the simulation that the ratio R varied only slightly for a given value of n, provided the value of Z remained invariant from distribution to distribution. These results indicate the robustness of the smooth empirical Bayes estimator to the form of the prior distribution. It was also observed that as Z increases, the values of the ratio R decrease. This phenomenon is best understood by considering the summary quantity Z. If Var(λ_k|λ) is large compared to Var(λ), then the maximum likelihood estimate of λ will vary widely. The empirical Bayes estimator, however, is capable of detecting this variation and can use this information to obtain better estimates of λ. Conversely, if Var(λ_k|λ) is small compared to Var(λ), then the maximum likelihood estimator would be expected to do quite well; in this case there is a great deal of information within an experiment, and previous experiments contribute very little information about the parameter. Even so, its performance never surpasses that achieved by the empirical Bayes estimator. Values of R are plotted in Figure 1 as a function of n, the number of past experiences, for values of Z ranging from 0.5 to 5.0. For ease of presentation, curves have been smoothed through the actual data points.

FIGURE 1. Ratio of the average squared error of Eₙ(λ|λ_k) to the average squared error of λ_k, plotted against the number of experiences (2 to 20) for Z = 0.5, 1.0, 2.5, and 5.0.

Figure 2 illustrates a typical comparison between the improvements realized by using the linear transformation defined by (18) and by not incorporating this feature into the analysis.
The dotted line represents the ratio R formed with an empirical Bayes estimator whose prior density approximation is based directly on the sequence of maximum likelihood estimates (4). The solid line represents the ratio R formed with the empirical Bayes estimator defined by (11), whose prior density approximation is based on the transformed sequence (5). Both curves illustrate the improvement over the maximum likelihood estimator achieved by the two empirical Bayes estimators. Note, however, that the solid line is significantly lower than the dotted line. This result was repeatedly reproduced in the simulation for all values of Z considered.

FIGURE 2. Typical comparison of the ratio R with (solid line) and without (dotted line) the linear transformation on λ_k, plotted against the number of experiences.

REFERENCES

[1] Bennett, G. K., and H. F. Martz, "A Continuous Empirical Bayes Smoothing Technique," Biometrika 59, 2, 361-368 (1972).
[2] Maritz, J. S., Empirical Bayes Methods (Methuen and Co., Ltd., London, 1970).
[3] Parzen, E., "On Estimation of a Probability Density Function and Mode," Ann. Math. Statist. 33, 1065-1076 (1962).
[4] Raiffa, H., and R. Schlaifer, Applied Statistical Decision Theory (Harvard Graduate School of Business Administration, 1961).
[5] Soland, R. M., "Bayesian Analysis of the Weibull Process With Unknown Scale Parameter and Its Application to Acceptance Sampling," IEEE Trans. Reliability R-17, 84-90 (1968).
[6] Soland, R. M., "Bayesian Analysis of the Weibull Process With Unknown Scale and Shape Parameters," IEEE Trans. Reliability R-18, 181-184 (1969).
[7] Thomas, D.
G., "Computer Methods for Generating Pseudo-Random Numbers from Pearson Distributions and Mixtures of Pearson and Uniform Distributions," unpublished Master of Science thesis, Virginia Polytechnic Institute and State University (1966).

OPTIMAL ALLOCATION OF UNRELIABLE COMPONENTS FOR MAXIMIZING EXPECTED PROFIT OVER TIME

Claude G. Henin
Faculty of Management Sciences
University of Ottawa
Canada

ABSTRACT

In the present paper we solve the following problem: determine the optimum redundancy level to maximize the expected profit of a system bringing constant returns over a time period T; i.e., maximize the expression P ∫₀^T R dt − C, where P is the return of the system per unit of time, R the reliability of this system, C its cost, and T the period for which the system is supposed to work. We present theoretical results that permit the application of a branch and bound algorithm to solve the problem. We also define the notion of consistency, which leads to the distinction of two cases and to a simplification of the algorithm for one of them.

1. INTRODUCTION

In [4] we described different methods for solving the following problem. A serial system made of n independent stages has a reliability of R and a cost of C for a mission of a certain duration. If the system functions throughout the whole mission, the resultant revenue is P dollars. The problem therefore was to maximize the expression PR − C, where R and C are increasing functions of the number of standby units at each stage. In this paper we consider a similar but more practical problem: we suppose that, when working, the system produces a certain revenue per unit of time. This is more likely to occur in real-life problems. For example, consider the orbiting of a commercial satellite or the placement of a submarine cable, which are sources of continual revenue as long as they function properly.
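The objective in the abstract, P ∫₀^T R dt − C, is easy to evaluate numerically once the stage reliabilities are specified. A sketch, assuming loaded standbys with the linear failure law R_i(m_i, t) = 1 − (λ_i t)^{m_i} used later in the paper (the parameter values below are hypothetical, and require T ≤ 1/λ_i):

```python
import numpy as np

def profit(m, lam, cost, P, T, npts=400):
    """Expected profit P * integral_0^T prod_i R_i(m_i, t) dt - sum_i c_i * m_i,
    with R_i(m_i, t) = 1 - (lam_i * t)**m_i (linear failure law, loaded standbys)."""
    t = np.linspace(0.0, T, npts)
    R = np.ones_like(t)
    for mi, li in zip(m, lam):
        R *= 1.0 - (li * t) ** mi                       # system reliability is the product
    integral = float(np.sum((R[1:] + R[:-1]) / 2) * (t[1] - t[0]))  # trapezoid rule
    return P * integral - sum(ci * mi for ci, mi in zip(cost, m))
```

For a revenue rate P large relative to component costs, doubling up each stage raises the profit, which is the tradeoff the paper's algorithm searches over.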
In both cases, the reliability of the system can be increased significantly through redundancy before the system begins operations; once it fails, however, the system can be repaired only with difficulty. The reliability of the system is equal to the product of the reliabilities R_i of each stage i. At each stage we have m_i components, i.e., one basic unit and (m_i − 1) standbys. At each instant t, the reliability R_i is an increasing function of m_i, the number of components.* The cost of a stage is m_i c_i, where c_i is the acquisition cost of one component of type i, and the system returns a net revenue of P per unit of time while functioning. The problem is to maximize the profit for a period T (where T can be infinite), i.e., to maximize the expression

(1) P ∫₀^T R(m₁, …, mₙ; t) dt − Σ_{i=1}^n c_i m_i,

where R(m₁, …, mₙ; t) = ∏_{i=1}^n R_i(m_i, t).

*The reliability R_i(t) of stage i is the probability that stage i is still working at time t. We neglect the influence of switching devices, which could diminish the reliability of each stage i if m_i becomes too large.

The functions R_i(m_i, t) are non-increasing in t because the reliability of each stage cannot increase with time. Another problem arises if we suppose that the returns are discounted at a rate r. In this case the problem becomes

(2) max P ∫₀^T e^{−rt} R(m₁, …, mₙ; t) dt − Σ_{i=1}^n c_i m_i.

But it is clear that (2) is identical to (1) with the factor e^{−rt} included in R, i.e., if we replace R in (1) by R′ = e^{−rt} R. For computational reasons it is easier to replace each R_i (i = 1, …, n) by R′_i = e^{−rt/n} R_i, which has the same effect as replacing R by R′. With this last transformation the treatment of the two problems is identical, and we shall restrict ourselves to the analysis of the first one. We shall analyze the properties of function (1) and indicate how this problem can be solved with the help of the methods described in [4].

2.
CONSISTENCY

In order to analyze the function (1), we must define an important property. Let m denote the n-vector (m₁, …, mₙ), and consider a subset S among all the possible vectors m. The subset S is said to be consistent with respect to the failure law of the components and the structure of the system if the following property holds: if for some t₀ ≤ T and for some m, m′ ∈ S we have R(m, t₀) ≥ R(m′, t₀), then R(m, t′) ≥ R(m′, t′) for all t′ ≤ T. Consistency for a set S implies that the reliability orderings among all the vectors belonging to S remain constant between zero and T.

By taking T sufficiently small and by taking an upper bound N on the number of components at each stage i, it is always possible to find consistent reliabilities. Indeed, if we have n stages, we have at most N^n reliability functions. As this number is finite, it is always possible to find consistent reliabilities on [0, T′], where T′ is smaller than the smallest positive intersection point of these N^n functions. It seems impossible to provide general necessary and sufficient conditions to insure consistency on the whole set of reliabilities. However, if the reliability function at a given stage can be written as R_i(m, t) = p_i(m) g_i(t), then the reliabilities are consistent. On the other hand, it is easy to find nonconsistent reliability functions. Numerical tests show that loaded standbys with exponential failure laws do not give consistent reliabilities for general values of T.

As another case, consider a two-stage system with components in a loaded mode and having a linear failure law; i.e., the reliability of the system is R = [1 − (λ₁t)^{m₁}][1 − (λ₂t)^{m₂}]. We shall show that for λ₁ ≥ λ₂ we can find two reliability curves which intersect each other. It is sufficient, for example, to take m = (m₁, m₂) = (4, 1) and m′ = (m′₁, m′₂) = (2, 2); T must be less than 1/λ₁. For t > λ₂/λ₁, R(m, t) is larger than R(m′, t), but for t < λ₂/λ₁ it is smaller.
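The intersection claimed for m = (4, 1) and m′ = (2, 2) can be confirmed numerically; λ₁ = 1 and λ₂ = 0.5 below are hypothetical values satisfying λ₁ ≥ λ₂, and the two test points straddle λ₂/λ₁ = 0.5.

```python
# Two loaded stages with linear failure laws; lambda1 >= lambda2 (hypothetical values).
l1, l2 = 1.0, 0.5

def R(m, t):
    """System reliability for the allocation m = (m1, m2)."""
    return (1.0 - (l1 * t) ** m[0]) * (1.0 - (l2 * t) ** m[1])

m, mp = (4, 1), (2, 2)
# The ordering flips across t = l2/l1 = 0.5 (both test points are below T < 1/l1 = 1):
below = R(m, 0.25) < R(mp, 0.25)   # below the crossing, m is worse
above = R(m, 0.9) > R(mp, 0.9)     # above the crossing, m is better
```

Since the two curves cross inside (0, T) for any T between λ₂/λ₁ and 1/λ₁, the pair {m, m′} is not consistent there.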
Therefore, for λ₂/λ₁ < T < 1/λ₁ the two reliability curves intersect each other and are inconsistent; for T smaller than λ₂/λ₁ they do not intersect and are consistent. For larger numbers of stages it becomes computationally very difficult to see whether the reliability curves intersect each other.

Therefore, the problem treated in this paper is more difficult than the one solved in [4]. For example, there is generally no sense in creating an undominated sequence of allocations at a given time as in [5]. However, if we compute an undominated sequence at a given time, and if the terms (i.e., the vectors m of this optimal sequence) are consistent for all t between 0 and T for all possible sequences, then we have the following theorem.

THEOREM I: If the set of vectors m of an undominated sequence at a given time t is consistent (i.e., if the undominated sequence remains the same for all t), then the optimal solution to (1) corresponds to a term of this undominated sequence.

PROOF: Suppose the contrary, namely that the optimal solution is given by a vector m not belonging to the optimal sequence and with a cost of C. Consider the two successive terms in the optimal sequence of costs C′ and C″ such that C′ < C ≤ C″. Then, by the definition of the optimal sequence and by consistency, the reliability R(t) of our solution is always smaller than the corresponding reliability R′(t) of the term costing C′ in the optimal sequence. Therefore P ∫₀^T R(t) dt − C is smaller than P ∫₀^T R′(t) dt − C′, and we arrive at a contradiction.

If the optimal sequence varies from one point in time to another, it is not true that the optimal solution to (1) is always a term of some optimal sequence; it can be a term which never appears in any optimal sequence at any time. Therefore the use of optimal sequences is of interest only when they are identical at every time between zero and T.
By extension, we shall call such an optimal sequence consistent.* As shown by our former example, it is very difficult to establish consistency for two reliability curves even in the case of a very simple failure law. A fortiori, it is very difficult if not impossible to establish such a property for a set, such as an optimal sequence, which cannot be formally defined mathematically. Therefore, the only way of checking the consistency of an undominated sequence is to compute optimal sequences for different times and see whether they are identical. If they are for a reasonable number of trials, it can be assumed that the undominated sequence is consistent. Naturally, consistency of the optimal sequence is a far weaker restriction than consistency of the whole set of reliabilities.

*This property implies more than consistency on the set of indices of the optimal sequence and less than consistency for all the possible indices.

3. PROPERTIES OF THE FUNCTION

If the standbys are in a loaded mode and if the unreliability of a component at stage i is p_i(t), formula (1) becomes

P ∫₀^T ∏_{i=1}^n {1 − [p_i(t)]^{m_i}} dt − Σ_{i=1}^n m_i c_i = P ∫₀^T R(m, t) dt − C,

by taking C = Σ_{i=1}^n m_i c_i (see [4]). For such a situation we have the following properties; proofs from earlier work can be used to verify them.

PROPERTY I: If the standby units at each stage are in a loaded mode, if p_i(t) ≥ p_j(t) for all t between 0 and T, and if c_i ≤ c_j, then at the optimal solution the optimal number of components at stage i, m*_i, is larger than or equal to the optimal number of components at stage j, m*_j.

PROOF: See the proof of Theorem I in [3]. This property reduces the number of solutions we must consider when using a branch and bound technique.

PROPERTY II: If P increases, ∫₀^T R dt and C are nondecreasing.

PROOF: As the proof of Theorem VI in [4].

COROLLARY: If the undominated sequence at any time is consistent, then R and C are nondecreasing functions of P.
This property and its corollary yield a lower bound on the cost, which can be used if a solution to the problem is known for a value of P smaller than the present one.

Consider variations in the duration of the mission T. Assume that R(t) and C are the reliability function and cost of the optimal solution for a duration T, and R′(t) and C′ the reliability function and cost of the optimal solution for a duration T′. For notational simplicity take Z(T) = ∫₀^T R(t) dt and Z′(T′) = ∫₀^{T′} R′(t) dt.

THEOREM II: If T′ > T, then Z(T′) − Z(T) ≤ Z′(T′) − Z′(T).

PROOF: By hypothesis, the following relationships hold:

PZ(T) − C ≥ PZ′(T) − C′,
PZ′(T′) − C′ ≥ PZ(T′) − C.

These two inequalities imply that Z(T) − Z′(T) ≥ Z(T′) − Z′(T′), or Z′(T′) − Z′(T) ≥ Z(T′) − Z(T).

COROLLARY: If the undominated sequence (at a given time) is consistent, then T′ > T implies that R′ ≥ R and C′ ≥ C.

This theorem and its corollary show that as T increases, the corresponding optimal solution is either a more costly and more reliable one, in the case of consistency of the undominated sequence, or a solution such that ∫ R dt becomes greater between T and the new horizon if the undominated sequence is not consistent.

THEOREM III: If the undominated sequence at a given time is consistent and if P′ ≥ P, then at the optimal solutions, m′* ≥ m*.

PROOF: Suppose that there are m*_i components at stage i with P and m′*_i = (m*_i − v) with P′ (v an integer > 0). Let Rⁱ and R′ⁱ be the reliability of all stages except i (i.e., R/R_i and R′/R′_i) in the optimal solutions corresponding to P and P′, respectively. Similarly, let Cⁱ and C′ⁱ be the costs of these (n − 1) stages in the corresponding solutions. By the corollary of Property II, R′(t) ≥ R(t) at the corresponding optimal solutions; thus R′ⁱ > Rⁱ. Moreover, by hypothesis,

(i) P ∫₀^T R_i Rⁱ dt − Cⁱ − m*_i c_i ≥ P ∫₀^T R′_i Rⁱ dt − Cⁱ − (m*_i − v) c_i, or P ∫₀^T (R_i − R′_i) Rⁱ dt ≥ v c_i.
Similarly,

(ii) P′ ∫₀^T R′_i R′ⁱ dt − C′ⁱ − (m*_i − v) c_i ≥ P′ ∫₀^T R_i R′ⁱ dt − C′ⁱ − m*_i c_i, or v c_i ≥ P′ ∫₀^T (R_i − R′_i) R′ⁱ dt.

Grouping (i) and (ii), we get

P ∫₀^T (R_i − R′_i) Rⁱ dt ≥ P′ ∫₀^T (R_i − R′_i) R′ⁱ dt,

and, as P < P′ and R_i(t) ≥ R′_i(t), we get Rⁱ > R′ⁱ. Hence, a contradiction.

This theorem allows us to determine a minimum (maximum) number of components at each stage if a solution is known for a smaller (larger) value of P and if the undominated sequences are consistent.

Generally, the reliabilities R_i(m_i) are piecewise concave in m_i. To assume such a property is not a drastic restriction. In the case of loaded standbys it is always satisfied. In the case of unloaded standbys this is not always true. However, if we consider, for example, components with exponential failure laws with mean 1/λ, the gain in reliability in passing from m units to (m + 1) units is, for a one-stage problem, (λt)^m e^{−λt}/m!. This gain is a decreasing function of m for λt < m. Problems will generally remain in such a situation because, if at the optimal solution m is less than λt for a time t less than T, this implies that the reliability at the end is not very high; for m = 1, this implies a reliability smaller than e^{−1} at time T.*

If R_i(m_i) is piecewise concave for all i, i.e., if our assumption holds, we have the following properties. Suppose that for each stage a maximum attainable reliability has been computed, i.e., Z_i(t), the limit of R_i(m_i, t) as m_i goes to infinity; this limit always exists because R_i is increasing in m_i and bounded by 1. Let Rⁱ_M (M representing the maximum) be the corresponding maximum reliability of all the stages except i.

PROPERTY III: If the reliability at each stage is a concave function of the number of components at this stage, there exists an upper bound M_i on the number of components at each stage i such that

M_i = min { k : P ∫₀^T (R_i(k+1, t) − R_i(k, t)) Rⁱ_M dt ≤ c_i }.
PROOF: The proof is analogous to that of Theorem V in [4].

Similarly, if a minimum number of components at each stage has been computed, and Rⁱ_L is the corresponding minimum reliability of all the stages except i, the following property holds.

PROPERTY IV: If the reliability at each stage is a concave function of the number of components at this stage, there exists a lower bound L_i on the number of components at this stage such that

L_i = max { k : P ∫₀^T (R_i(k, t) − R_i(k−1, t)) Rⁱ_L dt ≥ c_i }.

These properties are also valid in the discounted case. A study of the asymptotic behavior of the function (1) yields the following propositions.

*We will show later that the gain on the integral of R_i, i.e., ∫₀^T (R_i(m_i, t) − R_i(m_i − 1, t)) dt, is a decreasing function of m_i, and therefore the following properties are also fully applicable to this situation.

THEOREM IV: If P → ∞, then at the optimal solution m*_i tends to ∞ for all i.

PROOF: m*_i tends to ∞ because L_i does. In order to have L_i ≥ N (N arbitrarily large), it is sufficient to take

P ≥ c_i / ∫₀^T (R_i(N, t) − R_i(N−1, t)) Rⁱ_L dt,

where Rⁱ_L(t) can be taken as Rⁱ(1, …, 1; t), to prove the theorem.

When T tends to infinity, the results are less simple. However, it is possible to give some results depending upon the integrability of the reliability functions.

THEOREM V(a): If T = ∞, and if there exists an index j such that Z_j(t) is integrable on [0, ∞), then at the optimal solution to (1), m*_i remains finite for all i.

PROOF: We apply Lebesgue's theorem. As m_j goes to infinity, the function f(m_j) = R_j(m_j, t) Rʲ converges to f = Z_j(t) Rʲ and is bounded by the integrable function Z_j(t). Then I(f(m_j)) = ∫₀^∞ f(m_j) dt converges to I(f). Therefore I(f(m_j)) − I(f(m_j − 1)) converges to zero as m_j goes to infinity, and from Property III there exists a maximum number of components M_i for each stage i (i = 1, …, n).
In the discounted case it is easy to see that the above theorem is always applicable: it is sufficient to replace R_i by R′_i = e^{−rt/n} R_i to see that Z_i(t) ≤ e^{−rt/n} is integrable on [0, ∞). Therefore, in the discounted case, when the horizon is infinite, the optimal solution to the problem remains finite.

THEOREM V(b): If none of the Z_i is integrable, but if

(3) lim_{T→∞} ∫₀^T (R_j(m_j, t) − R_j(m_j − 1, t)) Zʲ dt < c_j/P − ε

for some positive number ε, then m*_j remains finite as T goes to infinity.

PROOF: By contradiction: if we take m*_j ≥ M, with M large enough that

∫₀^T (R_j(M, t) − R_j(M−1, t)) Zʲ dt < c_j/P,

we get M ≥ M_j and thus a contradiction.

For other situations the asymptotic behavior of m* is unknown. The function (1) may admit at least two maxima, one at finite range and the other at infinity. A priori, it is impossible to determine which of these maxima is the global one. However, if the left-hand side of (3) is greater than c_j/P + ε, letting m_j go to infinity increases the value of (1) to infinity.

4. AN EXAMPLE: UNLOADED STANDBYS

A situation which may frequently arise is the case of unloaded standbys. In this situation the standby units have no probability of failure until they are put into the system. This may frequently happen when only one stage is used, i.e., a system consisting of one component and spare parts which are introduced into the system when the main component fails. Let q(t) be the unreliability of one of these components, Qₙ(t) the unreliability of the system made of one basic unit and (n − 1) standbys, and let Q′ₙ(t) be the first derivative of Qₙ(t) with respect to t. The unreliability of the system is given by the following expression [2]:

Qₙ(t) = ∫₀^t q(t − τ) Q′_{n−1}(τ) dτ  for n > 1,
Q₁(t) = q(t).

Such a law can be computed or approximated in most cases. For an exponential failure law this formula becomes

(4) Qₙ(t) = 1 − Σ_{k=0}^{n−1} [(λt)^k/k!] e^{−λt},

and (4) is easily integrable.
The benefit of adding one unit to the system, i.e., of passing from n units to (n + 1) units, is

P ∫₀^T [(λt)^n/n!] e^{−λt} dt − c

(where c is the cost of a component of this one-stage system). Solving the integral, this function becomes

P[1 − e^{−λT}(1 + λT + … + (λT)^n/n!)]/λ − c.

This function represents the profit of adding one unit to a one-stage system with one basic unit and (n − 1) spare parts. It is decreasing in n and negative for n going to infinity. The solution n* to our problem is therefore the smallest number n such that the above expression is negative.

In the case of other failure laws the integrals are not so easy to solve. If the unreliability q(t) of the unit remains bounded between two linear functions of t, λ̲t ≤ q(t) ≤ λ̄t, the unreliability of the system is bounded between the unreliabilities of two systems of components with exponential laws with parameters λ̲ and λ̄. Therefore, the above formulas applied to λ̲ and λ̄ give approximate solutions to the problem.

In the case of multi-stage problems the above type of solution is not valid and the problem is more complex. However, for exponential failure laws, for example, all the terms in R, made of products of expressions such as (4), can be integrated. If for each stage we compute a number n_i as above, this number is not the optimal solution but an upper bound on the number of components at each stage, because the profit of adding a component is not

P ∫₀^T (R_i(m_i, t) − R_i(m_i − 1, t)) dt − c_i,

but

(5) P ∫₀^T (R_i(m_i, t) − R_i(m_i − 1, t)) Rⁱ dt − c_i.

If a lower bound is known on the number of components at each stage, (5) can be used, by replacing Rⁱ by Rⁱ_L, to generate new lower bounds on the number of components at each stage.

5. COMPUTATIONAL PROCESSES

The former theorems enable us to parametrize the problem as soon as an optimal solution is found for a value of P and T. However, the main difficulty remains the determination of such an optimal solution.
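The rule "take the smallest n at which the marginal benefit turns negative" translates directly into code. A sketch for the one-stage exponential case (function names ours):

```python
import math

def marginal_benefit(n, P, lam, T, c):
    """Profit of going from n units to n + 1 in a one-stage system with unloaded
    exponential spares: P * [1 - exp(-lam*T) * sum_{k=0}^{n} (lam*T)**k / k!] / lam - c."""
    s = sum((lam * T) ** i / math.factorial(i) for i in range(n + 1))
    return P * (1.0 - math.exp(-lam * T) * s) / lam - c

def optimal_units(P, lam, T, c, n_max=100):
    """n* = smallest n at which the marginal benefit becomes negative."""
    for n in range(1, n_max):
        if marginal_benefit(n, P, lam, T, c) < 0:
            return n
    return n_max
```

Since the marginal benefit is decreasing in n, the first sign change identifies n*; for instance, with P = 5, λ = 1, T = 2, c = 1 the benefit is still positive at n = 2 and turns negative at n = 3.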
We propose the following algorithm (in the case of R_i, or ∫₀^T R_i dt, concave in m_i):*

(a) Compute the limit, as m goes to infinity, of R_i(m, t) (i = 1, …, n).

(b) Compute a maximum number of components at each stage (Property III), stopping the process when no further improvement is possible. Similarly, compute the minimum number of components at each stage. If the minimum and maximum numbers of components coincide for every stage, stop: this is the solution. Otherwise, go to the next step.

(c) Select different times and compute the undominated sequence for each of these durations, the number of components at each stage remaining between the bounds determined at the former step. If these optimal sequences are identical, assume that the optimal sequence remains consistent (even if the system is not consistent, the solution will be very near the optimal one) and go to step (d). If the optimal sequences remain identical except for costs between a and b, assume that the optimal sequence is consistent for costs smaller than a and larger than b and go to step (f). If the optimal sequences do not fall into one of the two categories above, go to step (e).**

(d) Compute the value of (1) for all the terms of the optimal sequence and terminate by taking the term giving the largest value of (1). This is the optimal solution (or a solution very close to it).

(e) In this case, apply the branch and bound technique described in [3] and [4]. There are no fundamental theoretical difficulties in applying it to the present problem; however, at each step the integral of the reliability of the current solution considered by the algorithm must be computed, which lengthens the computations. The initial solution and lower bound on the function is either (0, …, 0) or (M₁, …, Mₙ), whichever gives the larger value of (1).

(f) Compute the value of (1) for all the terms of the optimal sequence for costs smaller than a or larger than b.
Take the term H among them which gives the largest value of (1). Then apply the branch and bound algorithm, as in step (e), with the following restrictions: all complete solutions must have a cost between a and b, and the initial solution and lower bound are H and the corresponding value of (1). The solution given by this algorithm is the optimal (or very nearly optimal) solution of the problem.

NOTE: It does not seem possible to use an approximate solution as we did in [4], because setting the first derivatives of the objective function to zero does not give a solvable system of equations as in [1]. In some cases, however, (1) is also a concave function of the m_i. From [4], this is certainly true in the region of m_i's such that

Σ_{i=1}^n (1 − R_i(m_i, T)) ≤ 1,

or everywhere if we have loaded standbys. If (M₁, …, Mₙ) satisfies such a condition, it seems possible to find a local maximum (which would probably be a global one too) by applying the same routine described in [4]. If this solution did not satisfy the condition, we would have to show that the function (1) is still concave at that point, or else abandon it and apply the algorithm described above.

*As mentioned before, in the discounted case R_i is replaced by e^{−rt/n} R_i.

**In the case where the undominated sequence remains consistent except on some intervals [a_j, b_j], it would be possible to extend the method described in (f).

REFERENCES

[1] Fan, L. T., C. S. Wang, F. A. Tillman, and Huang, "Optimization of System Reliability by the Discrete Maximum Principle," IEEE Transactions on Reliability (Sept. 1967).
[2] Gnedenko, B. V., Y. K. Belyayev, and A. D. Solovyev, Mathematical Methods of Reliability Theory (Academic Press, 1969).
[3] Henin, C. G., "An Algorithm for Maximizing Reliability Through System Redundancy," Carnegie-Mellon University, Management Sciences Report #216 (Aug. 1970).
[4] Henin, C.
G., "Computational Techniques for Optimizing Systems with Standby Redundancy," Nav. Res. Log. Quart. 19, 293-308 (June 1972).
[5] Kettelle, J. D., "Least Cost Allocation of Reliability Investment," Operations Research 10, 249-265 (1962).

A CONTINUOUS SUBMARINE VERSUS SUBMARINE GAME

Eric Langford*
University of Maine
Orono, Maine

ABSTRACT

This paper analyzes, from a game-theoretic standpoint, the simultaneous choice of speeds by a transitor and by an SSK which patrols back and forth perpendicular to the transitor's course. Using idealized acoustic assumptions and a cookie-cutter detection model which ignores counterdetection, we are able to present the problem as a continuous game and to determine an analytic solution. The results indicate that with these assumptions there are conditions under which neither a "go fast" nor a "go slow" strategy is optimal. The game provides a good example of a continuous game with a nontrivial solution which can be solved effectively.

INTRODUCTION

This paper analyzes from a game-theoretic standpoint the simultaneous choice of speeds by a transitor and by an SSK which patrols back and forth perpendicular to the transitor's course. (An SSK is a submarine whose mission is directed against enemy submarines.) The payoff, in effect, is taken to be the SSK's detection sweep width. This game was originally investigated by D. H. Wagner and E. P. Loane in classified reports during 1963-64. Their treatment was confined to choices of speeds from a discrete set, but applied to rather general acoustic conditions. The present analysis assumes an idealized form of propagation loss versus range and of noise versus speed. These idealizations permit the sweep width to be expressed as a convenient continuous function of the two speeds; accordingly, they allow each player to make choices from a continuum of speeds rather than from a discrete set.
They also permit a comprehensive analysis of the variety of cases which can arise within the idealizations.

Subsequent to Wagner and Loane's work, an approach to this game was undertaken by Mathematica [6], treating a continuous analytic payoff function based on idealized acoustics. Unfortunately, inconsistent acoustic assumptions were used in [6]; these were corrected by Mathematica in a subsequent report [1]. Motivated by [6] (and prior to [1]), the present analysis was undertaken, again using a continuous payoff function, but using acoustic assumptions generally felt to be consistent.

In this paper, we assume that the graph of noise versus speed is linear above a breakpoint speed below which noise is independent of speed; this is a common assumption and was made in [6]. We also make the usual "spreading law" assumption, i.e., that propagation loss is proportional to the k'th power of range when loss is expressed in power units. This of course is equivalent to the assumption that propagation loss is proportional to the logarithm of the (k'th power of) range when loss is expressed in decibels. (The inconsistency in [6] appears to be on this point.) We take for a payoff function what is essentially the SSK's kinematically enhanced sweep width; the SSK, of course, attempts to maximize this quantity and the transitor attempts to minimize it. The SSK will thus be maximizing his detection probability if we assume a cookie-cutter detection model, i.e., that detection occurs when and only when the signal excess (assumed deterministic) reaches a threshold value. We ignore counterdetection by the transitor. This will be a realistic assumption if the SSK's acoustic capability is far superior to that of the transitor.

*Research on this paper was performed when the author was with the Naval Postgraduate School and Daniel H. Wagner, Associates.
The more general problem, which takes into account counterdetection and which allows for a stochastic detection model, appears to be quite difficult to solve within this framework. However, in a discretized form, it was handled satisfactorily by Wagner and Loane in the above-mentioned reports.

The game is described in abstract terms in the first section, i.e., the payoff function is given in a form which is equivalent (for purposes of the game's analysis) to the formula for the SSK's kinematically enhanced sweep width. (The SSK's effective sweep width is increased by his back and forth patrol.) The properties of the payoff function are developed into geometric criteria for solution. In the second section, graphical methods of finding the solution are given and illustrated by examples. A numerical solution is described in the third section. The fourth section enumerates the possible outcomes as combinations of the pure and mixed strategies. Identification of the abstract game with the real SSK versus transitor game is given in the final section.

In an earlier version, this paper was submitted as a Memorandum to Commander, Submarine Development Group Two, in New London [4]. That earlier version includes graphs of all possible cases and a Fortran computer program to solve the game. A subsequent classified memorandum applied the analysis to some "real life" numbers.

DESCRIPTION OF THE GAME

We consider the following game. The maximizing player (SSK) chooses a speed u in the range 0 ≤ u_min ≤ u ≤ u_max, and the minimizing player (transitor) chooses a speed v in the range 0 < v_min ≤ v ≤ v_max. The payoff function is

F(u, v) = e^(cv − u) √(1 + u²/v²),

where c > 0 is a constant. The explicit identification of this payoff function with the SSK's detection sweep width will be clarified later. For the time being, we treat the game abstractly.

Computing the partial derivative with respect to v, we see that

F₂(u, v) = F(u, v) [c − u²/(v(u² + v²))].

The second partial derivative with respect to v is

F₂₂(u, v) = F(u, v) {[c − u²/(v(u² + v²))]² + u²(u² + 3v²)/(v²(u² + v²)²)},

so that F₂₂ > 0. That is, F is convex in its second argument; it is well known that this implies the following facts (see [2, p. 80] or [5, p. 259]):

1. The minimizing player (transitor) always has an optimal pure strategy.

2. The maximizing player (SSK) always has an optimal mixed strategy involving at most two speed choices; i.e., he either has an optimal pure strategy, or an optimal mix of two speeds.

Let us fix v and consider F as a function of the one variable u. Since F is continuous and restricted to a closed bounded interval, it assumes a maximum value; this maximum must be taken on at an endpoint (i.e., either u_min or u_max) or at an interior relative maximum, which can, in this case, be located by differentiation. To this end, we form the partial derivative of F with respect to u and equate it to zero:

F₁(u, v) = F(u, v) [u/(u² + v²) − 1] = 0.

Now F never vanishes, so that F₁(u, v) = 0 iff u = u² + v². The solution set of this equation in the u-v plane is the semicircle

{(u, v): (u − 1/2)² + v² = 1/4, and v > 0}.

From the form of F₁, it is evident that F₁(u, v) > 0 if (u, v) is inside the semicircle, and that F₁(u, v) < 0 if (u, v) is outside the semicircle. Thus the graph of F(u, v) versus u for fixed v can have three qualitatively different forms, as shown in Figure 1. The case v = 0.6 is typical of v > 1/2 and the case v = 0.4 is typical of v < 1/2.

[Figure 1. Qualitatively different forms of F(·, v): v = 0.6 (strictly decreasing), v = 0.5 (inflection point), v = 0.4.]

Let us define the function φ as follows:

φ(v) = max {F(u, v): u_min ≤ u ≤ u_max}.

From the above observations, it is not difficult to infer the following:

(1) If v ≥ 1/2, then φ(v) = F(u_min, v).
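As a quick numerical sanity check on these formulas (a Python sketch of our own; the function names and sample values are not from the paper), one can compare the analytic partial derivative F₂ against a central difference and verify that the second difference in v is positive, as F₂₂ > 0 requires:

```python
import math

def F(u, v, c):
    """Payoff F(u, v) = e^(cv - u) * sqrt(1 + u^2/v^2)."""
    return math.exp(c * v - u) * math.sqrt(1.0 + (u / v) ** 2)

def F2(u, v, c):
    """Analytic partial derivative of F with respect to v."""
    return F(u, v, c) * (c - u ** 2 / (v * (u ** 2 + v ** 2)))

c, u, v, h = 1.0, 0.6, 0.4, 1e-5
# Central difference agrees with the analytic F2 ...
numeric = (F(u, v + h, c) - F(u, v - h, c)) / (2.0 * h)
assert abs(numeric - F2(u, v, c)) < 1e-6
# ... and the second difference in v is positive (convexity in v).
assert F(u, v + h, c) - 2.0 * F(u, v, c) + F(u, v - h, c) > 0.0
```

This kind of spot check is worth running whenever a derivative has been transcribed by hand, since a sign or index slip is immediately exposed.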
(2) If v < 1/2, then

(1)    φ(v) = max {F(u_min, v), F(u_max, v), F(u₀, v)},

where u₀ = 1/2 + √(1/4 − v²), if this falls within the interval u_min ≤ u₀ ≤ u_max; otherwise we ignore u₀.

Since F is convex in its second argument, v, it follows that φ is continuous. Its minimum value will be the value of the game; moreover, if φ assumes its minimum at v = v₀, then v₀ is an optimal pure strategy for the minimizing player. By the convexity of F, φ is unimodal (hence the above v₀ is uniquely defined). Since φ is unimodal, we can locate its minimum numerically with great efficiency using a binary search; graphical methods are also possible.

Similarly, we can fix u and solve the equation F₂(u, v) = 0:

F₂(u, v) = F(u, v) [c − u²/(v(u² + v²))] = 0.

The solution set of this equation is

{(u, v): cv³ + cu²v = u², (u, v) ≠ (0, 0)}.

By the Cardan-Tartaglia formula, this defines v as a function of u as follows:

v = u^(2/3) [∛(1/(2c) + √(1/(4c²) + u²/27)) + ∛(1/(2c) − √(1/(4c²) + u²/27))].

Conversely, if v < 1/c, we can solve for u as a function of v:

u = √(cv³/(1 − cv)).

Figure 2 shows the graphs of F₁(u, v) = 0 and F₂(u, v) = 0 for several values of c. The intersection of the graphs of F₁(u, v) = 0 and F₂(u, v) = 0 will give a saddle point if u > 1/2, i.e., if c > 1. If there is a pure strategy minimax solution to the game which is interior for both players (i.e., neither u_min nor u_max for the SSK and neither v_min nor v_max for the transitor), then the solution must occur at this saddle point. (Edge minimaxes need not be of this form.) To find this saddle point, we solve the following equations simultaneously:

F₁(u, v) = F₂(u, v) = 0.

[Figure 2. Graphs of F₁(u, v) = 0 and of F₂(u, v) = 0 for c = 0.5, 0.75, 1.0, 1.5, 2.0.]

Strangely enough, the answer is exceedingly simple:

u = c²/(c² + 1),  v = c/(c² + 1).

Note that there can be no interior minimax if c ≤ 1.
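The closed-form saddle point can be verified directly: substituting u = c²/(c² + 1) and v = c/(c² + 1) into the analytic partials makes both vanish. A small Python check (our own illustration, not part of the paper):

```python
import math

def F(u, v, c):
    return math.exp(c * v - u) * math.sqrt(1.0 + (u / v) ** 2)

def F1(u, v, c):
    """Partial derivative of F with respect to u."""
    return F(u, v, c) * (u / (u ** 2 + v ** 2) - 1.0)

def F2(u, v, c):
    """Partial derivative of F with respect to v."""
    return F(u, v, c) * (c - u ** 2 / (v * (u ** 2 + v ** 2)))

for c in (1.5, 2.0, 3.0):                # an interior saddle requires c > 1
    u, v = c ** 2 / (c ** 2 + 1.0), c / (c ** 2 + 1.0)
    assert abs(F1(u, v, c)) < 1e-9 and abs(F2(u, v, c)) < 1e-9
    assert u > 0.5                        # the saddle lies past the midpoint
```

Note also that at the saddle u/v = c, so the enhancement factor there is √(1 + c²).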
GRAPHICAL SOLUTION OF THE GAME

Define the function f as follows: For any u₀ with u_min ≤ u₀ ≤ u_max, we define v₀ = f(u₀) iff

F(u₀, v₀) = min {F(u₀, v): v_min ≤ v ≤ v_max}.

That is, for any admissible u, f(u) is the v which minimizes F. By the v-convexity of F, f is a well-defined, single-valued function. Moreover, f is continuous and monotone increasing. In fact, if we stay away from v_min and v_max, f is strictly increasing; more precisely, if u₁ < u₂ are such that v_min < f(u₁) ≤ f(u₂) < v_max, then f is strictly increasing on the interval [u₁, u₂]. See Figures 5, 6, and 7 for examples.

Actually, we can give a simple formula for f. Let h(u) be the solution to F₂(u, h(u)) = 0, i.e.,

h(u) = u^(2/3) [∛(1/(2c) + √(1/(4c²) + u²/27)) + ∛(1/(2c) − √(1/(4c²) + u²/27))].

For unrestricted v, F(u₀, v) is convex and has an absolute minimum at v = h(u₀). Therefore if v_min ≤ h(u₀) ≤ v_max, then f(u₀) = h(u₀). If h(u₀) ≥ v_max, then f(u₀) = v_max, and if h(u₀) ≤ v_min, then f(u₀) = v_min. We can summarize these cases as follows:

f(u₀) = min {v_max, max [v_min, h(u₀)]}.

It is straightforward to determine f(u₀) graphically. Let u₀ be fixed and consider the line segment

S = {(u, v): u = u₀ and v_min ≤ v ≤ v_max}.

The three possibilities, namely f(u₀) = v_min, f(u₀) = v_max, and f(u₀) = h(u₀), can be exhibited graphically as follows:

(1) If the line S lies completely above the graph of h(u) versus u, then f(u₀) = v_min.
(2) If the line S lies completely below the graph of h(u) versus u, then f(u₀) = v_max.
(3) If the line S intersects the graph of h(u) versus u at (u₀, v₀), then f(u₀) = v₀ = h(u₀).

These three possibilities are graphed in Figure 3.

[Figure 3. Three possibilities in the determination of f(u); c = 1 in this example.]

In a similar fashion we define the "function" g as follows: For any v₀ with v_min ≤ v₀ ≤ v_max, we define u₀ = g(v₀) iff

F(u₀, v₀) = max {F(u, v₀): u_min ≤ u ≤ u_max} = φ(v₀).

From Figure 1, it is clear that this need not, in general, define a single-valued function. We shall subsequently show that either g is a continuous single-valued function or that the graph of g consists of two continuous pieces which overlap only at an endpoint. More precisely, in this second circumstance, there exists a point v₀ such that v_min ≤ v₀ ≤ v_max, and such that g restricted to either of the subintervals [v_min, v₀) or (v₀, v_max] is a continuous single-valued function; however at v = v₀, g(v) is bivalent; it has the two values g(v₀ − 0) and g(v₀ + 0). (It will also be shown later that this right-hand limit can only be u_min and that the left-hand limit can be either u_max or u₀ = 1/2 + √(1/4 − v₀²).) In either case, g is monotone decreasing: g(v₁) ≥ g(v₂) whenever v₁ ≤ v₂. Figure 5 illustrates the first possibility, while Figures 6 and 7 illustrate the second.

Note that when we fix v and regard F(u, v) as a function only of u, the constant c enters the expression for F(u, v) only as a constant multiplier. Since we are interested in locating the maximizing u for fixed v, it follows that the value of c is unimportant in the discussion which follows. That is, the maximizing u is located in the same place no matter what value c has.

[Figure 4. Six possibilities in the determination of g(v).]

We refer to Figure 4: the semicircle ABC is the locus of F₁(u, v) = 0. The points on the quarter-circle AB are relative minima for F(u, v) considered as a function of u for fixed v, and the points on the quarter-circle BC are relative maxima. The point B itself is an inflection point (cf. Figure 1). The curve DB is defined as follows: if (u₁, v₀) lies on DB, then F(u₁, v₀) = F(u₂, v₀), where (u₂, v₀) lies on the quarter-circle BC. The curve DB is thus obtained as the solution set of the following transcendental equation:

F(u, v) = F(1/2 + √(1/4 − v²), v), where 0 < u ≤ 1/2.

Let v₀ be fixed and let S′ denote the line segment {(u, v): u_min ≤ u ≤ u_max and v = v₀}. There are six possibilities:

(1) The line S′ lies completely within ABC; then g(v₀) = u_max. The line S′ may meet, but not cross, ABC.
(2) The line S′ lies completely outside of ABC; then g(v₀) = u_min. The line S′ may meet, but not cross, ABC.
(3) The line S′ crosses AB, but does not meet or cross DB. In this case, g(v₀) may be u_min, u_max, or both, depending on the relative sizes of F(u_min, v₀) and F(u_max, v₀).
(4) The line S′ crosses DB; then g(v₀) = u_min.
(5) The line S′ meets or crosses BC, but does not meet DB. In this case, the maximizing u₀ occurs at the intersection of S′ and BC. Evidently g(v₀) = u₀ = 1/2 + √(1/4 − v₀²).
(6) The line S′ meets, but does not cross, DB. In this case, g(v₀) = u_min, unless S′ also meets or crosses BC. In this latter circumstance, g(v₀) has the two values u_min and u₀, where u₀ is at the intersection of S′ and BC as in the case above.

These possible cases are all graphed in Figure 4. If we graph g(v) versus v, the graph will be a nice continuous curve as long as g(v) "stays put," i.e., as long as g(v) is one or the other of the endpoints or is u₀. Problems arise at the two "transition stages":

A. As in Case 3, when F(u_min, v₀) = F(u_max, v₀) = φ(v₀). This will be called a type A transition.
B. As in Case 6, when F(u_min, v₀) = F(u₀, v₀) = φ(v₀). This will be called a type B transition.

As v increases through a transition stage v₀, g(v) will make an abrupt jump. Precisely at the transition stage v = v₀, g(v) will be two-valued. It will now be shown that as v increases through the transition stage v₀, one of the following two things will happen:

A. If v₀ is a type A transition, then g(v₀) must jump from u_max to u_min.
B. If v₀ is a type B transition, then g(v₀) must jump from u₀ to u_min.

It is a consequence of the above that g can have at most one transition stage, since any transition puts g(v) in the "absorbing state" u_min. An example of a type A transition is found in Figure 7, and an example of a type B transition is found in Figure 6.

To prove the above assertion, let us suppose that u₁ < u₂ and that for some v₀, F(u₁, v₀) = F(u₂, v₀). If we define the function

G(v) = F(u₁, v) − F(u₂, v),

then by assumption, G(v₀) = 0. By computing the (continuous) derivative G′(v), it is easy to show that G′(v₀) > 0, so that G is increasing in a neighborhood of v₀; in other words, as v increases through v₀, F(u₁, v) is first less than F(u₂, v) and then greater, so that a transition can occur from a larger value of u to a smaller value, never the other way.

The "function" g is monotone decreasing, since it decreases at its transition point (if there is one) and since it obviously must decrease at points other than transition points. As is the case with f, if v₁ < v₂ are such that u_min < g(v₂) ≤ g(v₁) < u_max, then g is strictly decreasing on the interval [v₁, v₂].

If we plot both f and g within the rectangle of admissible speeds defined by

{(u, v): u_min ≤ u ≤ u_max and v_min ≤ v ≤ v_max},

then one of two things will occur:

(1) The two graphs will intersect at a single point (u₀, v₀). In this case, u₀ is an optimal pure strategy for the maximizing player (SSK) and v₀ is an optimal pure strategy for the minimizing player (transitor). See Figure 5 for an example of this case.

(2) The two graphs will not intersect. More precisely, there will exist a transition point v₀ such that g(v₀) has the two values u₁ and u₂, where u₁ < u₂ and where u₁ < f⁻¹(v₀) < u₂. See Figures 6 and 7 for examples of this.
Note that f⁻¹ is defined at v₀ since f is strictly increasing as it passes through the hole in the graph of g. In this case, the minimizing player again has the optimal pure strategy v₀, but the maximizing player now has an optimal mixed strategy given by u₁ with probability p and u₂ with probability 1 − p. The constant p is found by solving the following equation:

p·F₂(u₁, v₀) + (1 − p)·F₂(u₂, v₀) = 0.

[Figure 5. Graphical solution of the game, Example 1. Transitor's speed range: 0.05 to 0.70; SSK's speed range: 0.50 to 0.95; c = 2.0. Optimal pure strategy for transitor: 0.4000; optimal pure strategy for SSK: 0.8000.]

[Figure 6. Graphical solution of the game, Example 2. Transitor's speed range: 0.25 to 0.60; SSK's speed range: 0.15 to 0.90; c = 1.0. Optimal pure strategy for transitor: 0.4591; optimal mixed strategy for SSK: 0.1500 with fraction 0.3972, 0.6981 with fraction 0.6028.]

[Figure 7. Graphical solution of the game, Example 3. Transitor's speed range: 0.15 to 0.40; SSK's speed range: 0.00 to 0.20; c = 1.0. Optimal pure strategy for transitor: 0.2852; optimal mixed strategy for SSK: 0.0000 with fraction 0.1350, 0.2000 with fraction 0.8650.]

No other possibilities (e.g., two intersections) can arise, by the aforementioned monotonicity properties of f and g. Three numerical examples of this graphical solution are given to make this more clear. In each of these figures, the graph of f is indicated by a dashed line and the graph of g is indicated by a dot-dashed line. The box is the rectangle of admissible speeds defined above.

NUMERICAL SOLUTION OF THE GAME

The following is a step-by-step procedure for solving the game.
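Solving the mixing equation for p gives p = F₂(u₂, v₀)/(F₂(u₂, v₀) − F₂(u₁, v₀)). Plugging in the rounded numbers printed in Example 2 (Figure 6) reproduces the printed fraction 0.3972; a Python check of our own:

```python
import math

def F(u, v, c):
    return math.exp(c * v - u) * math.sqrt(1.0 + (u / v) ** 2)

def F2(u, v, c):
    return F(u, v, c) * (c - u ** 2 / (v * (u ** 2 + v ** 2)))

def mix_probability(u1, u2, v0, c):
    """Solve p*F2(u1, v0) + (1 - p)*F2(u2, v0) = 0 for the weight p on u1."""
    a, b = F2(u1, v0, c), F2(u2, v0, c)
    return b / (b - a)

# Example 2 (Figure 6): u1 = 0.1500, u2 = 0.6981, v0 = 0.4591, c = 1.0.
u1, u2, v0, c = 0.15, 0.6981, 0.4591, 1.0
assert abs(F(u1, v0, c) - F(u2, v0, c)) < 1e-3   # both speeds achieve phi(v0)
p = mix_probability(u1, u2, v0, c)
assert abs(p - 0.3972) < 5e-4                     # matches the printed fraction
```

The first assertion is itself a useful consistency check: the two pure strategies in an optimal mix must yield the same payoff against v₀.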
It essentially repeats the graphical procedure, but from a point of view which emphasizes suitability for computation. This procedure has been programmed for the GE-235 computer; a copy of the program and sample output are included in [4]. The following step-by-step procedure can be considered a macroscopic flow chart of the computer program.

(1) Locate the minimum of φ(v) = max_u F(u, v). Since φ is unimodal, this can be done easily by iterative computation using a binary search. The function φ(v) itself is evaluated by using formula (1) derived earlier. The minimum value of φ(v) is the value of the game.

(2) Let v₀ be that unique number such that φ(v₀) = min_v φ(v). This is obtained automatically as a by-product of step 1. Then v₀ is the optimal pure strategy for the minimizing player (transitor).

(3) Solve the equation F(u, v₀) = φ(v₀) for u by checking the three possible places where the maximum could occur (viz. u_min, u_max, and u₀ = 1/2 + √(1/4 − v₀²)). If there is a unique solution u*, then u* is an optimal pure strategy for the maximizing player (SSK). If there are two distinct solutions u₁ < u₂, then a mix of u₁ with probability p and u₂ with probability 1 − p will be optimal. The number p is the solution to the equation

p·F₂(u₁, v₀) + (1 − p)·F₂(u₂, v₀) = 0.

ENUMERATION OF POSSIBLE OUTCOMES

From an examination of Figure 1 or otherwise, we see that there are five qualitatively distinct possibilities for the maximizing player:

U(1). A pure strategy of u_min.
U(2). A pure strategy of u_max.
U(3). A pure strategy of u₀, where u_min < u₀ < u_max.
U(4). A mixed strategy of u_min and u_max.
U(5). A mixed strategy of u_min and u₀, where u_min < u₀ < u_max.

Since the minimizing player always has a pure strategy, there are only three possibilities for him:

V(1). A pure strategy of v_min.
V(2). A pure strategy of v_max.
V(3). A pure strategy of v₀, where v_min < v₀ < v_max.

Apparently there are 15 cases to consider.
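Steps (1) and (2) above can be sketched in a few lines of Python (our own rendering; an interval-shrinking search over the unimodal φ stands in for the program's binary search). On the data of Example 1 it recovers the printed transitor optimum of 0.4000:

```python
import math

def F(u, v, c):
    return math.exp(c * v - u) * math.sqrt(1.0 + (u / v) ** 2)

def phi(v, c, u_min, u_max):
    """phi(v) = max_u F(u, v): checked at u_min, u_max and, when v < 1/2,
    at the interior critical point u0 = 1/2 + sqrt(1/4 - v^2)."""
    candidates = [u_min, u_max]
    if v < 0.5:
        u0 = 0.5 + math.sqrt(0.25 - v * v)
        if u_min <= u0 <= u_max:
            candidates.append(u0)
    return max(F(u, v, c) for u in candidates)

def transitor_optimum(c, u_min, u_max, v_min, v_max, iters=200):
    """Shrink [v_min, v_max] around the minimum of the unimodal phi."""
    lo, hi = v_min, v_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if phi(m1, c, u_min, u_max) < phi(m2, c, u_min, u_max):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Example 1: c = 2.0, u in [0.50, 0.95], v in [0.05, 0.70].
v0 = transitor_optimum(2.0, 0.50, 0.95, 0.05, 0.70)
assert abs(v0 - 0.4000) < 1e-3
```

Here the minimum of φ coincides with the interior saddle v = c/(c² + 1) = 0.4, and the value of the game is φ(0.4) = √5.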
However, it is known [5, p. 267] that in cases V(1) and V(2), the maximizing player must also have a pure strategy. (This can be inferred also from the geometric reasoning previously used.) Thus there are only 11 possible cases. In [4], examples are given of each of these 11 cases. We remark that for certain values of c, some cases are forbidden. For example, if c ≤ 1, then the case U(3)-V(3) cannot occur.

IDENTIFICATION OF ABSTRACT GAME WITH REAL GAME

We shall now identify the foregoing abstract game with a real SSK versus transitor game. The SSK moves back and forth at constant speed u′ across a rectangular barrier zone. The transitor enters the zone on a course perpendicular to the SSK's course at a constant speed v′. The payoff for the SSK is detection sweep width. He attempts to maximize this quantity and the transitor attempts to minimize it.

Let L_S(v′) denote the radiated noise (in decibels, measured at 1 yard from the source) of the transitor as a function of its speed v′, and let L_N(u′) denote the self noise (in decibels) of the SSK as a function of its speed u′. We assume that the noise curves of both SSK and transitor are of the form shown in Figures 8 and 9, respectively.

[Figure 8. Noise curve for SSK: self noise L_N(u′) versus SSK's speed u′.]

[Figure 9. Noise curve for transitor: radiated noise L_S(v′) (db) versus transitor's speed v′.]

It is obvious that the SSK will never travel more slowly than u′_min and that the transitor will never travel more slowly than v′_min. We therefore assume that L_N(u′) and L_S(v′) are both linear for all u′ and v′, but in the analysis we will not consider any speeds less than these minimum speeds. The maximum speeds, u′_max and v′_max, are of course determined by the physical characteristics of the respective submarines.
We have then the following formulas for L_N(u′) and L_S(v′):

L_N(u′) = L_N^min + b(u′ − u′_min)
L_S(v′) = L_S^min + a(v′ − v′_min),

where a and b are the slopes, measured in decibels/knot.

Assume that propagation loss obeys a spreading law, so that the decibel loss in propagating from 1 to R yards is k log₁₀ R, for some fixed k > 0. The unenhanced sweep width W for the SSK is then given by

k log₁₀ (W/2) = L_S(v′) − L_N(u′) + N_DI − N_RD,

where N_DI and N_RD are the SSK's directivity index and recognition differential. (Of course all terms in this equation are taken at the frequency and bandwidth of interest.) If we solve this equation for W, we obtain

W = 2 exp {(1/k)(log 10)[L_S(v′) − L_N(u′) + N_DI − N_RD]}
  = 2 exp {(1/k)(log 10)[L_S^min + av′ − av′_min − L_N^min − bu′ + bu′_min + N_DI − N_RD]}
  = 2 exp {(1/k)(log 10)[L_S^min − L_N^min − av′_min + bu′_min + N_DI − N_RD]} exp {(1/k)(log 10)(av′ − bu′)}.

If we make the following substitutions:

u = ((b/k) log 10) u′,
v = ((b/k) log 10) v′,
K = 2 exp {(1/k)(log 10)[L_S^min − L_N^min − av′_min + bu′_min + N_DI − N_RD]},
c = a/b,

then W is of the form

W = K e^(cv − u),

where K and c are positive constants.

As noted by Wagner and Loane and by Mathematica, the kinematic enhancement of sweep width in the back and forth patrol is a multiplicative increase in the approximate amount √(1 + (u′/v′)²), providing the SSK's patrol legs are substantially longer than the acoustic sweep width W. (See [3, Equation (7.1.4)].) Since the kinematic enhancement factor depends only on the ratio u′/v′ = u/v, the results of the previous analysis apply.

Here the primed variables, namely u′ and v′, refer to true ship speeds (in knots). The unprimed variables, namely u and v, refer to the "normalized speeds" as considered in the solution to the game. All graphs, etc., are referred to these normalized speeds.

ACKNOWLEDGMENTS

I would like to thank Dr. Daniel H. Wagner for his help in the preparation of this paper. The work was originally supported by ONR Contract Nonr-4784(00). The writing of this paper was supported by an ONR Foundation Grant (FY1968).
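Under these substitutions, the passage from physical to normalized speeds is a single scale factor (b/k) ln 10 applied to both speeds, with c = a/b; since both speeds are scaled alike, the ratio u/v = u′/v′, and hence the enhancement factor, is preserved. A Python illustration (all numeric parameter values here are hypothetical):

```python
import math

def normalize_speeds(a, b, k, u_prime, v_prime):
    """Map true speeds (knots) to the game's normalized speeds:
    u = (b/k)(ln 10) u', v = (b/k)(ln 10) v', with c = a/b.
    a, b: noise-curve slopes in dB/knot; k: spreading-law exponent."""
    scale = (b / k) * math.log(10.0)
    return scale * u_prime, scale * v_prime, a / b

# Hypothetical parameters: a = 1.5 dB/knot, b = 1.0 dB/knot, k = 20.
u, v, c = normalize_speeds(1.5, 1.0, 20.0, u_prime=10.0, v_prime=8.0)
assert abs(u / v - 10.0 / 8.0) < 1e-12   # the speed ratio is preserved
assert abs(c - 1.5) < 1e-12
```

In particular, the kinematically enhanced sweep width W·√(1 + (u′/v′)²) becomes K·F(u, v) in the normalized variables, which is exactly the payoff analyzed above.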
REFERENCES

[1] Agin, Norman I., et al., "The Application of Game Theory to ASW Detection Problems," Mathematica Report, Princeton, New Jersey (Sept. 30, 1967).
[2] Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics (Addison-Wesley, Reading, Mass., 1959).
[3] Koopman, B. O., "Search and Screening," OEG Report No. 56, Operations Evaluation Group, Office of the Chief of Naval Operations, Navy Department (1946).
[4] Langford, E. S., "Game-Theoretic Analysis of Choice of Speeds by SSK and Transitor," Daniel H. Wagner, Associates Memorandum to CSDG-2 (Nov. 17, 1966).
[5] McKinsey, J., Introduction to the Theory of Games (McGraw-Hill, New York, 1952).
[6] "A Study of Optimal Patrol and Transit Strategies in a Rectangular Barrier Zone Using Mathematical Games," Mathematica Report, Princeton, New Jersey.

TOTAL OPTIMALITY OF INCREMENTALLY OPTIMAL ALLOCATIONS*

Lawrence D. Stone
Daniel H. Wagner, Associates
Paoli, Pennsylvania

ABSTRACT

This paper considers the problem of finding optimal solutions to a class of separable constrained extremal problems involving nonlinear functionals. The results are proved for rather general situations, but they may be easily stated for the case of search for a stationary object whose a priori location distribution is given by a density function on R, a subset of Euclidean n-space. The functional to be optimized in this case is the probability of detection, and the constraint is on the amount of effort to be used. Suppose that a search of the above type is conducted in such a manner as to produce the maximum increase in probability of detection for each increment of effort added to the search. Then, under very weak assumptions, it is proven that this search will produce an optimal allocation of the total effort involved. Under some additional assumptions, it is shown that any amount of search effort may be allocated in an optimal fashion.

1.
INTRODUCTION

In this paper we consider the relationship between incrementally optimal allocations and totally optimal allocations. Motivation for studying this relationship arises naturally in planning a search for a lost object. Suppose that the search planner is given authorization to search for a fixed time interval, and he conducts the search to produce the maximum probability of detection at the end of the interval. If the search fails to detect the lost object within the allotted time, the planner may be given authorization to continue searching for an additional time increment. In this case the planner may allocate the additional search effort to maximize the probability of detection in the given increment. Having done this, one may ask whether the search could have produced a higher detection probability if it were known in advance that both the initial time interval and the added increment were available.

In mathematical terms, the search problem is to allocate optimally a given amount of effort in order to detect a stationary object, the target, located in Euclidean n-space, R. There is a function f which gives the probability density of the target's location. Suppose T is the amount of effort available for the search. Then the search planner seeks a function q*: R → [0, ∞) such that ∫_R q*(x)dx ≤ T and

(1.1)    ∫_R b(x, q*(x))f(x)dx = max {∫_R b(x, q(x))f(x)dx: q ≥ 0, ∫_R q(x)dx ≤ T}.

The function b(x, ·) is the local effectiveness function at x. That is, b(x, y) gives the conditional probability of detecting the target given that it is located at x and the effort density is y at x. The integral on the left of (1.1) gives the probability of detecting the target when using allocation q*. The function q* is called an optimal allocation. This problem has an obvious analog when R is replaced by a countable set of locations or boxes.

*This research was supported by the Naval Analysis Programs, Office of Naval Research, under Contract No. N00014-69-C-0435.

For the case where b(x, y) = 1 − e^(−y) for x∈R and y ≥ 0, Koopman [4, p. 617] made the following observation. Suppose one allocates T₁ amount of effort in an optimal fashion, but fails to detect the target. An increment T₂ of effort then becomes available. If one allocates this additional effort in an incrementally optimal manner (i.e., optimal considering the previous allocation of T₁ amount of effort), then one obtains an optimal allocation of T₁ + T₂ effort. That is, two incrementally optimal allocations produce a totally optimal allocation.

In [2] an incomplete attempt was made to show that incrementally optimal allocations produce totally optimal allocations provided that ∂b(x, y)/∂y is a positive monotonic nonincreasing function of y for x∈R. In section 2 of this paper we show that for any Borel measurable local effectiveness function, incrementally optimal allocations are totally optimal whenever the target's probability distribution is given by a density function as in (1.1). In the case where the search space is countable, we prove that concavity of the local effectiveness function guarantees that incrementally optimal allocations are totally optimal. In addition, it is shown by counterexample that this property need not hold for countable search spaces if the local effectiveness function is not concave.

A search plan is called uniformly optimal if it maximizes the probability of detection at each instant during the search. In section 3, we show the existence of uniformly optimal search plans under additional hypotheses which are given there.

Our results hold in a more general situation than that of search theory. Thus, we introduce the following framework, which is substantially the same as that in [6], one difference being that we deal only with Borel functions. Let R be a Borel subset of Euclidean n-space. We fix Borel functions L and U with L ≤ U which are defined on R. The functions L and U may take infinite values.
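For Koopman's exponential case b(x, y) = 1 − e^(−y), the optimal allocation takes a classical "water-filling" form: q*(x) = max(0, ln(f(x)/λ)), with the multiplier λ chosen so the effort budget is met. A discrete Python sketch of this (the bisection bounds and the cell probabilities are our own illustration):

```python
import math

def optimal_allocation(f, T, tol=1e-10):
    """Optimal effort over boxes for b(x, y) = 1 - exp(-y):
    q[i] = max(0, ln(f[i]/lam)), with lam found by bisection so sum(q) = T."""
    lo, hi = 1e-12, max(f)
    while hi - lo > tol:
        lam = 0.5 * (lo + hi)
        q = [max(0.0, math.log(fi / lam)) for fi in f]
        if sum(q) > T:        # multiplier too small: budget overspent
            lo = lam
        else:
            hi = lam
    return q

q = optimal_allocation([0.5, 0.3, 0.2], T=2.0)
assert abs(sum(q) - 2.0) < 1e-6
# Wherever effort is positive, the marginal rates f[i]*exp(-q[i]) equalize.
rates = [fi * math.exp(-qi) for fi, qi in zip([0.5, 0.3, 0.2], q) if qi > 0]
assert max(rates) - min(rates) < 1e-6
```

The equalized marginal rate is exactly the pointwise-Lagrangian condition that drives the results of section 2, so this small example previews the machinery introduced below.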
Define

Ω = {(x, y): x∈R, |y| < ∞ and L(x) ≤ y ≤ U(x)}.

We fix a real-valued Borel function e defined on Ω and the family H of a.e. (with respect to Lebesgue measure) real-valued Borel functions q defined on R such that L ≤ q ≤ U. For q∈H we understand e(·, q(·)) to be a function from R to the reals. Define

Φ = H ∩ {q: e(·, q(·)) and q are integrable},

and let

E(q) = ∫_R e(x, q(x))dx and C(q) = ∫_R q(x)dx for q∈Φ.

All integration is Lebesgue integration. A q*∈Φ is said to be optimal if

E(q*) = max {E(q): q∈Φ and C(q) = C(q*)}.

In the case where L(x) = 0, U(x) = ∞ for x∈R and e(x, y) = f(x)b(x, y) for (x, y)∈Ω, E(q) becomes the probability of detecting the target with allocation q, and C(q) becomes the amount of effort required by q. Then an optimal q* maximizes the probability of detection which can be obtained with effort C(q*).

A function f defined on the real line is said to be increasing if y ≥ x implies f(y) ≥ f(x). A function f is said to be concave if for all x, y in the domain of f,

f(αx + (1 − α)y) ≥ αf(x) + (1 − α)f(y) for 0 ≤ α ≤ 1.

2. INCREMENTAL OPTIMIZATION

For i = 1, 2, . . ., let q_i∈Φ be such that q₁ ≤ q₂ ≤ . . . . Let q₀ = L. If
Thus, by Corollary 5.2 of [6], there exists a real number Ai such that for a.e. xeR (2.1) /(*,<?,(*), A,) 3= /(*,y, A,) for \y\<°o and L{x)^y^U{x). In other words a necessary condition for qi to be optimal is that it maximize a pointwise Lagrangian for some multiplier A*. Similarly, the incrementally optimal nature of q 2 implies the existence of a real number A 2 such that for a.e. xeR (2.2) <f(x,q 2 (x),X 2 )^f(x,y,\ 2 ) for \y\ < » and qi (x) *S y =S U(x). In order to prove that q 2 is optimal it is sufficient to find a real number A such that for a.e. xeR (2.3) f{x, q 2 (x), A) &S(x, y, A) for \y\ < oo and L(x) =£ y ^ U(x). The sufficiency of (2.3) follows from a well known result concerning Lagrange multipliers (see, for example [3], [8] or Theorem 2.1 of [6]). By (2.1) and (2.2) (2.4) k 2 (q 2 (x)-q l (x)) =£e(*, q 2 {x))-e{x, q^x)) =£ A, (q 2 {x) —q x (x)) for a.e. xeR. Recall that q 2 2= q x . If q 2 {x) = q\{x) for a.e. xeR, then (2.3) holds for A= Ai. If q 2 {x) > q t (x) for xina set of positive measure, then (2.4) implies that A 2 «S Ai. In this case for a.e. xeR and y such that \y \ < °° andL(x) ^ y *£ q\{x) , we have 0*Se(x, g,(x))-e(x, y) - A, {q>{x) -y) ^ e(x, q l (x))-e(x, y) - A 2 (<7,(*) -y). That is for a.e. jcc/?, (2.5) /(*, y, A 2 ) ^ /(*, <?,(*), A 2 ) =£ t{x, q 2 (x), A 2 ) for |y| < oo, L(x) ^y^qi(x). 422 L. D. STONE Combining (2.5 ) and (2.2 ) we obtain (2.3 ) with X = \ 2 . Thus, qi is optimal. By repeating the argument for 93, q4, • • •■> the theorem is proved. We now shift our attention to the case where R is a countable set. That is for some countable subset J of the integers, R — { xy.j e J }. Let £(«)= £ e(*j, </(*,)) JO C(q)=^q(xj). JO Carry over the definitions of incrementally and totally optimal sequences in the obvious way. 
One may use the method of proof given in Theorem 2.1 to show that incrementally optimal sequences are totally optimal for the case where R is countable, provided that the existence of a real number λ such that

(2.6)    ℓ(x_j, q*(x_j), λ) = sup {ℓ(x_j, y, λ) : |y| < ∞ and L(x_j) ≤ y ≤ U(x_j)}  for j ∈ J

is a necessary condition for q* to satisfy

(2.7)    E(q*) = max {E(q) : q ∈ Φ and C(q) = C(q*)}.

From Corollary 5.3 and Remark 2.3 of [6] we conclude that if e(x_j, ·) is a concave function for j ∈ J, then (2.6) is necessary for (2.7). Thus, we may state the following theorem.

THEOREM 2.2: If R is countable and e(x_j, ·) is concave for j ∈ J, then any incrementally optimal sequence is totally optimal.

The following example shows that one cannot remove the assumption that e(x_j, ·) is concave in Theorem 2.2. The example also shows that (2.6) is not necessary for (2.7) when e(x_j, ·) is not concave for j ∈ J.

EXAMPLE 2.3: Let R = {1, 2} be a doubleton set, L = 0, U(1) = 2, and U(2) = √3. Define

    e(1, y) = ½y  for 0 ≤ y ≤ 2,

    e(2, y) = ½y  for 0 ≤ y ≤ 1,  e(2, y) = ½y + ¼(y − 1)²  for 1 ≤ y ≤ √3.

Note that both e(1, ·) and e(2, ·) are everywhere differentiable. For 0 ≤ T ≤ 2 + √3, define

    η(1, T) = 0  for 0 ≤ T < √3,    η(1, T) = T − √3  for √3 ≤ T ≤ 2 + √3;

    η(2, T) = T  for 0 ≤ T < √3,    η(2, T) = √3  for √3 ≤ T ≤ 2 + √3.

Then η(i, ·), i = 1, 2, is increasing, and for each T ≥ 0, C(η(·, T)) = T and E(η(·, T)) gives the maximum of E(q) over all nonnegative functions q defined on {1, 2} such that C(q) = T. Note that for q* = η(·, 1), (2.6) is not satisfied for any λ. An example of a function q* which satisfies (2.7), but for which there is no λ satisfying (2.6), is also given in [8]. One may check that the sequence of allocations (q_1, q_2), where q_1(1) = 1, q_1(2) = 0 and q_2(1) = 1, q_2(2) = 1, is incrementally optimal. However, E(q_2) = 1 and C(q_2) = 2, while E(η(·, 2)) = 2 − ½√3 > 1, so that q_2 is not optimal, i.e., (q_1, q_2) is not totally optimal. In [2, p.
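The mechanism of Example 2.3 can be checked numerically. In the sketch below, e1 is a linear cell and e2 is an assumed convex (hence non-concave) cell with slope ½ on [0, 1]; the exact form of e2 is an illustrative choice matching the example's structure, not a quotation from the paper. A grid search confirms that no allocation dominating q_1 = (1, 0) with total effort 2 beats the value 1, while the unconstrained optimum at effort 2 exceeds 1.

```python
import math

# Two-cell check: a convex cell makes an incrementally optimal sequence
# fail to be totally optimal.  e2 below is an assumed illustration.
SQRT3 = math.sqrt(3)

def e1(y):  # cell 1, 0 <= y <= 2, linear
    return 0.5 * y

def e2(y):  # cell 2, 0 <= y <= sqrt(3); convex, hence not concave
    return 0.5 * y if y <= 1 else 0.5 * y + 0.25 * (y - 1) ** 2

def best(T, lo1=0.0):
    # max of e1(a) + e2(T - a) over feasible a >= lo1, by grid search
    vals = []
    for i in range(2001):
        a = lo1 + (min(2.0, T) - lo1) * i / 2000
        y2 = T - a
        if 0.0 <= y2 <= SQRT3:
            vals.append(e1(a) + e2(y2))
    return max(vals)

E_q2 = e1(1.0) + e2(1.0)     # the incrementally optimal q2 = (1, 1)
print(E_q2)                  # 1.0
print(abs(best(2.0, lo1=1.0) - 1.0) < 1e-9)  # nothing above q1 does better
print(best(2.0) > 1.0)       # but the unconstrained optimum exceeds 1
```

Every increment added on top of q_1 earns at rate ½ in either cell, so q_2 looks optimal step by step; the globally optimal allocation of effort 2 instead saturates the convex cell.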
328] it is claimed that (in our notation) the existence of a function η defined on R × [0, S] such that η(·, T) is an optimal allocation of T amount of effort for each 0 ≤ T ≤ S and η(x, ·) is increasing for x ∈ R guarantees that incrementally optimal sequences are totally optimal. Example 2.3 shows that for discrete R this claim does not hold. If in addition to the existence of a function η satisfying the above conditions we have that for each amount of effort there is an almost everywhere unique optimal allocation of that effort, then any incrementally optimal sequence is totally optimal. Although not stated as such, this result is proven in [2]. Example 2.3 shows, of course, that optimal allocations need not be unique. Even when E and C are defined as integrals with respect to Lebesgue measure on n-space as is done in Section 1, optimal allocations need not be unique. In fact, it is easy to see that if there exists a subset D of R having positive measure such that for x ∈ D the graph of e(x, ·) contains a nondegenerate straight-line segment of slope λ, then there are amounts of effort for which an optimal allocation of that effort is not almost everywhere unique.

REMARK 2.4: Let us return to the search situation described in Section 1. That is, L(x) = 0, U(x) = ∞ for x ∈ R, and e(x, y) = f(x)b(x, y) for (x, y) ∈ Ω. Suppose that an optimal allocation q_1 has been performed and that the search has failed to detect the target. Let f_1 be the posterior target location density given failure to detect the target. Thus

(2.7)    f_1(x) = f(x)[1 − b(x, q_1(x))] / [1 − E(q_1)].

For x ∈ R, let b_1(x, ·) be the conditional local effectiveness function at x given that q_1(x) search effort density was placed at x and the target was not detected. Then

(2.8)    b_1(x, y) = [b(x, q_1(x) + y) − b(x, q_1(x))] / [1 − b(x, q_1(x))].

Suppose that h is an allocation of effort which is added onto the original allocation q_1, so that the resulting total effort density is q_1(x) + h(x) for x ∈ R.
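The posterior quantities just defined can be sanity-checked numerically. The sketch below uses a two-point search space with assumed data (f = (0.6, 0.4) and b(x, y) = 1 − e^(−y), illustrative choices only) and verifies that the conditional detection probability built from f_1 and b_1 agrees with (E(q_1 + h) − E(q_1)) / (1 − E(q_1)).

```python
import math

# Check of the Remark 2.4 posterior identities on a two-point space.
# The density f and detection function b are assumed illustrations.
f = [0.6, 0.4]

def b(y):
    return 1.0 - math.exp(-y)

def E(q):
    return sum(fx * b(qx) for fx, qx in zip(f, q))

q1 = [1.0, 0.5]   # effort already expended (search failed)
h = [0.3, 0.7]    # added increment of effort

denom = 1.0 - E(q1)
f1 = [fx * (1.0 - b(qx)) / denom for fx, qx in zip(f, q1)]    # posterior density
b1 = lambda qx, y: (b(qx + y) - b(qx)) / (1.0 - b(qx))        # conditional effectiveness

E1 = sum(f1x * b1(qx, hx) for f1x, qx, hx in zip(f1, q1, h))
direct = (E([a + c for a, c in zip(q1, h)]) - E(q1)) / denom
print(abs(E1 - direct) < 1e-12)   # the two expressions agree
```

This is exactly the algebra behind the equivalence of conditional and incremental optimality: the normalizing factors 1 − b(x, q_1(x)) cancel between f_1 and b_1.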
Then

(2.9)    E_1(h) = ∫_R f_1(x) b_1(x, h(x)) dx

is the conditional probability of detecting the target given that allocation q_1 failed. Fix an increment of effort T. Suppose h* has the property that ∫_R h*(x) dx = T and

    E_1(h*) = max {E_1(h) : h ≥ 0 and ∫_R h(x) dx = T}.

Then h* is sometimes called a conditionally optimal search. If we let q_2 = q_1 + h*, then we claim (q_1, q_2) is an incrementally optimal sequence. To see this, we observe that by (2.7) and (2.8),

    E_1(h) = [E(q_1 + h) − E(q_1)] / [1 − E(q_1)].

Thus maximizing E_1 subject to h ≥ 0 and ∫_R h(x) dx = T is equivalent to maximizing E subject to q ≥ q_1 and C(q) = C(q_1) + T. The claim now follows from the definition of incremental optimality, and we see that the concepts of incremental and conditional optimality coincide for searches of the type discussed in this paper. Hence, under the conditions of Theorem 2.1 or 2.2, a sequence of conditionally optimal searches (h_1, h_2, . . .) produces, by setting q_i = Σ_{k=1}^{i} h_k, a totally optimal sequence (q_1, q_2, . . .) of search allocations.

3. EXISTENCE THEOREM

In this section we find conditions under which uniformly optimal search plans or allocation schedules exist. More precisely, let J be an interval of real numbers. Then an allocation schedule over J is a Borel function η defined on R × J such that for T ∈ J, η(·, T) ∈ Φ and for a.e. x ∈ R, η(x, ·) is increasing. An allocation schedule η is uniformly optimal if

(3.1)    C(η(·, T)) = T  and  E(η(·, T)) = max {E(q) : C(q) ≤ T}  for T ∈ J.

This definition is a generalization of the definition of uniform optimality for search plans given by Arkin in [1]. In the special case where E(q) gives the probability of detection resulting from the allocation of search effort q, we call η a search plan. Then a uniformly optimal search plan maximizes the probability of detection at each instant during the search. In order to prove the existence of such allocation plans we define a notion of coverability similar to the one in [7].
Suppose p is a real-valued function defined on an interval J of real numbers. If p is concave, then throughout the interior of its domain, p′ exists a.e. and is decreasing. Moreover, if p is continuous, then p(t) − p(s) = ∫_s^t p′(r) dr for s, t ∈ J. By an extreme point of a concave function p, we mean a point on its graph which does not lie on a chord joining two other points on the graph.

Define m(x, ·) to be the minimal concave majorant of e(x, ·) for all x ∈ R for which such a majorant exists. We say that m covers e if the following conditions are satisfied:

(i) For a.e. x ∈ R, m(x, ·) exists and is continuous.
(ii) m is a Borel function.
(iii) e(x, y) = m(x, y) whenever (y, m(x, y)) is an extreme point of m(x, ·).

Note that condition (iii) is equivalent to assuming that e(x, ·) is upper semicontinuous at y such that (y, m(x, y)) is an extreme point of m(x, ·). For q ∈ Φ we define

    M(q) = ∫_R m(x, q(x)) dx

whenever the integral on the right exists.

Differentiation is always with respect to the last component of the argument and is denoted by a prime, e.g., for (x, y) ∈ Ω,

    e′(x, y) = lim_{h→0} [e(x, y + h) − e(x, y)]/h.

Let m cover e. If a function q ∈ Φ and a real number λ satisfy, for a.e. x ∈ R,

    m′(x, y) ≥ λ  for a.e. y such that L(x) < y < q(x),
    m′(x, y) ≤ λ  for a.e. y such that q(x) < y < U(x),

then we say that the pair (q, λ) satisfies the Neyman-Pearson inequalities. When e(x, ·) and m(x, ·) are increasing and U(x) = ∞, it is convenient to define e(x, ∞) = lim_{y→∞} e(x, y) and m(x, ∞) = lim_{y→∞} m(x, y).

Before proceeding with our main existence result, we prove two lemmas which will be useful in this section. Lemma 3.1 relates closely to Theorem 1 and Remark 3 of [5].

LEMMA 3.1: Let m cover e. If there is a q* ∈ Φ such that E(q*) > −∞ and a λ ≥ 0 such that for a.e. x ∈ R

(3.2)    (i) m′(x, y) ≥ λ for a.e. y such that L(x) < y < q*(x),
         (ii) m′(x, y) ≤ λ for a.e.
y such that q*(x) < y < U(x),
         (iii) e(x, q*(x)) = m(x, q*(x)),

then

(3.3)    E(q*) = max {E(q) : C(q) ≤ C(q*)}.

PROOF: By (3.2)(iii), M(q*) exists. It is an easily shown Neyman-Pearson result (see Theorem 1 of [5]) that for λ ≥ 0, (i) and (ii) imply

(3.4)    M(q*) = max {M(q) : C(q) ≤ C(q*)}.

Suppose that there is an r ∈ Φ such that E(r) > E(q*) and C(r) ≤ C(q*). Since m majorizes e, we have M(r) ≥ E(r) > E(q*) = M(q*), which contradicts (3.4). This proves the lemma.

For λ > 0 and x such that m(x, ·) exists, we define

    φ_u(x, λ) = sup {y : y = L(x) or m′(x, y) ≥ λ},
    φ_ℓ(x, λ) = inf {y : y = U(x) or m′(x, y) ≤ λ}.

Then for λ > 0, we let

    I_ℓ(λ) = ∫_R φ_ℓ(x, λ) dx,    I_u(λ) = ∫_R φ_u(x, λ) dx.

The functions φ_ℓ and φ_u will be our main tools for constructing solutions to the constrained extremal problems considered here. The following lemma displays some of the properties of these functions.

LEMMA 3.2: Suppose m covers e and for a.e. x ∈ R, e(x, ·) is increasing. If −∞ < E(L) ≤ E(U) < ∞ and |C(L)| < ∞, then the following hold:

(a) φ_u(·, λ) ∈ Φ and φ_ℓ(·, λ) ∈ Φ for λ > 0.
(b) I_ℓ and I_u are finite and decreasing.
(c) φ_ℓ(x, ·) and I_ℓ are right continuous, and φ_u(x, ·) and I_u are left continuous.
(d) (φ_ℓ(·, λ), λ) and (φ_u(·, λ), λ) satisfy the Neyman-Pearson inequalities.
(e) A pair (q, λ), where q ∈ Φ and λ ≥ 0, satisfies the Neyman-Pearson inequalities if, and only if, φ_ℓ(x, λ) ≤ q(x) ≤ φ_u(x, λ) for a.e. x ∈ R.
(f) For any λ > 0, we may find a Borel function a defined on R × [I_ℓ(λ), I_u(λ)] such that
    (1) a(x, ·) is increasing for a.e. x ∈ R,
    (2) C(a(·, T)) = T for I_ℓ(λ) ≤ T ≤ I_u(λ),
    (3) (a(·, T), λ) satisfies the Neyman-Pearson inequalities for all I_ℓ(λ) ≤ T ≤ I_u(λ).
(g) lim_{λ→∞} I_u(λ) = C(L).
(h) For λ > 0 and x such that m(x, ·) exists, (φ_u(x, λ), m(x, φ_u(x, λ))) and (φ_ℓ(x, λ), m(x, φ_ℓ(x, λ))) are extreme points of m(x, ·).

PROOF: A straightforward verification shows that φ_ℓ(·, λ) and φ_u(·, λ) are Borel functions for each λ > 0 and that (a) holds. Thus, the integrals I_ℓ(λ) and I_u(λ) are well defined for each λ > 0.

For a.e. x ∈ R, the following hold. Since e(x, ·) is increasing, m(x, ·) is increasing. If U(x) is finite, then (U(x), m(x, U(x))) is an extreme point and m(x, U(x)) = e(x, U(x)). If U(x) = ∞, then the increasing nature of e(x, ·) and the minimal nature of m(x, ·) yield m(x, ∞) = e(x, ∞). Since |C(L)| < ∞, L(x) is finite and m(x, L(x)) = e(x, L(x)).

To prove (b), we observe that −∞ < E(L) = M(L) ≤ E(U) = M(U) < ∞. Thus, for a.e. x ∈ R, m(x, L(x)) and m(x, U(x)) are finite. Since m(x, ·) is increasing, we have for a.e. x ∈ R,

    m(x, U(x)) − m(x, L(x)) ≥ ∫_{L(x)}^{z} m′(x, y) dy ≥ (z − L(x)) m′(x, z)  for L(x) < z < U(x).

Thus, m′(x, z) ≤ [m(x, U(x)) − m(x, L(x))]/(z − L(x)), and it follows that

    φ_u(x, λ) ≤ (1/λ)[m(x, U(x)) − m(x, L(x))] + L(x)  for λ > 0.

Hence,

    −∞ < C(L) ≤ I_ℓ(λ) ≤ I_u(λ) ≤ (1/λ)[M(U) − M(L)] + C(L) < ∞  for λ > 0,

which proves that I_ℓ and I_u are finite. The decreasing nature of m′(x, ·) for a.e. x ∈ R guarantees that φ_u(x, ·) and φ_ℓ(x, ·) are decreasing for a.e. x ∈ R. Thus, (b) follows.

The left continuity of φ_u(x, ·) and the right continuity of φ_ℓ(x, ·) for a.e. x ∈ R follow from their definitions and the decreasing nature of m′(x, ·). The monotone convergence theorem and the finiteness of I_u and I_ℓ may be used to show the left and right continuity of I_u and I_ℓ, respectively. Thus, (c) holds. Properties (d) and (e) follow directly from the definitions of φ_ℓ and φ_u.

In order to prove (f), we use a device of Arkin's [1] and define for 0 ≤ s ≤ ∞

    h_λ(x, s) = φ_u(x, λ) if |x| < s,    h_λ(x, s) = φ_ℓ(x, λ) if |x| ≥ s,

and

    H_λ(s) = ∫_R h_λ(x, s) dx.

By the monotone convergence theorem, H_λ is continuous.
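For a smooth, strictly concave m the functions φ_u and φ_ℓ coincide and simply solve m′(x, y) = λ. A minimal sketch with assumed illustrative data (triangular density and exponential detection, so m′(x, y) = f(x)e^(−y); these choices are not from the paper) shows φ_u and the decreasing behavior of I_u asserted in Lemma 3.2(b):

```python
import math

# phi_u and I_u for an assumed smooth concave case with L = 0,
# U = infinity, m(x, y) = f(x) * (1 - exp(-y)), f triangular on [-1, 1].
DX = 0.001
GRID = [i * DX for i in range(-1000, 1001)]

def f(x):
    return max(0.0, 1.0 - abs(x))

def phi_u(x, lam):
    # largest y with m'(x, y) = f(x) * exp(-y) >= lam; else y = L(x) = 0
    return math.log(f(x) / lam) if f(x) > lam else 0.0

def I_u(lam):
    return sum(phi_u(x, lam) * DX for x in GRID)

lams = [0.8, 0.4, 0.2, 0.1]          # decreasing multipliers
vals = [I_u(l) for l in lams]
print(all(a < b for a, b in zip(vals, vals[1:])))   # I_u grows as lambda falls
```

As λ sweeps downward, I_u(λ) sweeps through the achievable effort levels, which is exactly how the schedule in Theorem 3.3 is parametrized.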
Moreover, H_λ is increasing and H_λ(0) = I_ℓ(λ), H_λ(∞) = I_u(λ). Thus for I_ℓ(λ) ≤ T ≤ I_u(λ), we may choose ξ(T) such that H_λ(ξ(T)) = T. Defining a(x, T) = h_λ(x, ξ(T)) for x ∈ R, we see that a satisfies conditions (1) and (2) of (f). Condition (3) follows from (e). Property (g) follows easily from the monotone convergence theorem and the definition of φ_u. Property (h) may be verified from the definitions of φ_ℓ, φ_u, and an extreme point. This completes the proof.

THEOREM 3.3: Suppose m covers e, and for a.e. x ∈ R, e(x, ·) is increasing. If −∞ < E(L) ≤ E(U) < ∞ and |C(L)| < ∞, then there exists a uniformly optimal allocation schedule η over [C(L), C(U)).

PROOF: We consider first the case where

    I_u(0) = lim_{λ→0} I_u(λ) = C(U).

The case I_u(0) < C(U) requires only routine modifications which are discussed at the end of the proof.

We take η(·, C(L)) = L. Since I_u is monotone, it has only a countable number of discontinuities. Let K be a countable index set such that {λ_k : k ∈ K} is the set of discontinuity points of I_u. Let J_k = (I_u(λ_k+), I_u(λ_k)] for k ∈ K, where I_u(λ_k+) denotes the right-hand limit. The intervals J_k are disjoint and are the jump intervals at the discontinuity points of I_u. For T ∈ (C(L), C(U)) − ∪_{k∈K} J_k, let

    λ*(T) = sup {λ : I_u(λ) = T}.

By the left continuity of I_u, I_u(λ*(T)) = T. For T ∈ J_k, let λ*(T) = λ_k and choose a function a_k defined on R × J_k to have the properties of a in (f) of Lemma 3.2. Define

    η(x, T) = φ_u(x, λ*(T))  if C(L) < T < C(U) and T ∉ ∪_{k∈K} J_k,
    η(x, T) = a_k(x, T)  if T ∈ J_k and k ∈ K.

Then for each C(L) < T < C(U), C(η(·, T)) = T and (η(·, T), λ*(T)) satisfies the Neyman-Pearson conditions. Since m covers e and property (h) of Lemma 3.2 holds, we have that for each C(L) < T < C(U), e(x, η(x, T)) = m(x, η(x, T)) for a.e. x ∈ R. Thus, by Lemma 3.1, η satisfies (3.1).

To verify that η(x, ·) is increasing for a.e. x ∈ R, we let R′ be the set of x ∈ R such that m(x, ·) exists. Then by the fact that m covers e, R − R′ has measure 0. Suppose it is not the case that η(x, ·) is increasing for a.e.
x ∈ R. Then there is an x ∈ R′ and numbers S and T such that C(L) < T < S < C(U) and

(3.4)    η(x, S) < η(x, T).

Since (η(·, T), λ*(T)) and (η(·, S), λ*(S)) satisfy the Neyman-Pearson inequalities for all x ∈ R′, we have

    λ*(T) ≤ m′(x, y) ≤ λ*(S)  for a.e. y such that η(x, S) < y < η(x, T).

One may check that λ* is a decreasing function, so that λ*(T) = λ*(S). Thus, for some k ∈ K, T and S are both in the closure of J_k. However, a_k(x, ·) is constructed by property (f) of Lemma 3.2 to be increasing on the closure of J_k. This contradicts (3.4) and proves the theorem for the case where I_u(0) = C(U).

If I_u(0) < C(U), we proceed as before for C(L) < T < I_u(0). We then define

    φ_u(x, 0) = lim_{λ→0} φ_u(x, λ).

From the increasing nature of e(x, ·), it follows that if q ∈ Φ and q(x) ≥ φ_u(x, 0) for x ∈ R, then q will satisfy (3.2) with λ = 0. Hence, to complete the definition of η(x, ·) for I_u(0) ≤ T < C(U), one need only choose η so that η(x, T) ≥ φ_u(x, 0) and C(η(·, T)) = T, which may be easily done. This completes the proof.

Observe that the hypotheses of Theorem 3.3 may be weakened to require that m(x, ·) rather than e(x, ·) be increasing for a.e. x ∈ R. The theorem remains unchanged except that η must be restricted to [C(L), I_u(0)]. This is no real restriction since for q ≥ φ_u(·, 0), E(q) ≤ E(φ_u(·, 0)).

Theorem 2 of Arkin [1] is similar to Theorem 3.3 above with the exception that [1] claims that there exists a function β such that

    η(x, T) = ∫_0^T β(x, s) ds,    C(η(·, T)) = T,

and η is uniformly optimal. However, the following is a counterexample to Theorem 2 of [1]. (Moreover, the proof in [1] is not sufficient to show the truth of Theorem 3.3.) Let R = [0, 1], L = 0, U = ∞, and

    e(x, y) = 0 for 0 ≤ y < 1,    e(x, y) = 1 for y ≥ 1,    for x ∈ R.

It is clear that any uniformly optimal search plan η must have the property that for a.e. x ∈ R, η(x, ·) jumps from 0 to 1 at some point T, but there is no function β which produces this behavior for η.
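For a smooth search case the schedule of Theorem 3.3 has no jump intervals and reduces to T ↦ φ_u(·, λ*(T)), with λ*(T) found by inverting I_u. A numerical sketch with assumed illustrative data (triangular target density and exponential detection function, neither taken from the paper) shows that the resulting detection probability increases with the effort level T, as uniform optimality requires:

```python
import math

# Sketch of the schedule eta(., T) = phi_u(., lam*(T)) for assumed data:
# f(x) = 1 - |x| on [-1, 1] and b(x, y) = 1 - exp(-y), so e = f * b is
# concave in y, m = e, and m'(x, y) = f(x) * exp(-y).
DX = 0.001
GRID = [i * DX for i in range(-1000, 1001)]
f = lambda x: max(0.0, 1.0 - abs(x))

def phi_u(x, lam):
    return math.log(f(x) / lam) if f(x) > lam else 0.0

def I_u(lam):
    return sum(phi_u(x, lam) * DX for x in GRID)

def lam_star(T):
    lo, hi = 1e-9, 1.0            # I_u(1) = 0 and I_u is decreasing
    for _ in range(60):           # bisect (geometrically) I_u(lam) = T
        mid = math.sqrt(lo * hi)
        lo, hi = (mid, hi) if I_u(mid) > T else (lo, mid)
    return hi

def detect_prob(T):
    lam = lam_star(T)
    return sum(f(x) * (1 - math.exp(-phi_u(x, lam))) * DX for x in GRID)

probs = [detect_prob(T) for T in (0.5, 1.0, 2.0)]
print(all(a < b for a, b in zip(probs, probs[1:])))  # increasing in T
```

Each η(·, T) concentrates effort where the current marginal rate f(x)e^(−y) is highest, which is the Neyman-Pearson structure used throughout Section 3.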
Under the conditions of Theorem 3.3, we have shown that there exists, for any C(L) ≤ T < C(U), a q* such that C(q*) = T and

    E(q*) = max {E(q) : q ∈ Φ and C(q) ≤ T}.

Theorem 8 of [7] provides a similar existence result whenever m covers e and −∞ < C(L) ≤ C(U) < ∞. In comparison, Theorem 3.3 of this paper removes the restriction that C(U) < ∞, but adds monotonicity conditions on e(x, ·) and boundedness conditions on E. In [6] there is also a discussion of related existence theorems.

One might conjecture that Theorem 3.3 would remain true without assuming that e(x, ·) is increasing, provided that we assumed |E(q)| < B for some number B and all q ∈ Φ. Similarly, one might conjecture that the restriction C(L) > −∞ could be omitted. However, the following two counterexamples show both of these conjectures to be false.

EXAMPLE 3.4: Let R = [1, ∞), L = 0, and U(x) = x + 1/x² for x ∈ R. For x ∈ R, define

    e(x, y) = y  for 0 ≤ y ≤ 1/x²,
    e(x, y) = 1/x² − (y − 1/x²)/x³  for 1/x² < y ≤ x + 1/x².

Note that m = e and that |E(q)| ≤ 1 for all q ∈ Φ. Suppose q* is optimal and ∞ > C(q*) > 1. By Corollary 7.2 of [6] there exists a λ such that

(3.5)    e(x, q*(x)) − λq*(x) = sup {e(x, y) − λy : 0 ≤ y ≤ x + 1/x²}  for a.e. x ∈ R.

Since e(x, ·) is concave for x ∈ R, this implies

    e′(x, y) ≥ λ  for 0 < y < q*(x),
    e′(x, y) ≤ λ  for q*(x) < y < x + 1/x²,

for a.e. x ∈ R. One may check that if λ ≥ 0, then C(q*) ≤ 1. Thus, the above λ must be negative. It follows from the above inequalities that q*(x) = x + 1/x² for x³ > −1/λ. Thus, C(q*) = ∞, which contradicts our assumption that C(q*) < ∞. Thus, one cannot replace the monotonicity of e(x, ·) by boundedness of E in Theorem 3.3.

EXAMPLE 3.5: Let R = [1, ∞), L = −1, U = 1, and e(x, y) = y/x² for −1 ≤ y ≤ 1. Observe that e = m and all the conditions of Theorem 3.3 are satisfied except that C(L) = −∞. Suppose q* is optimal and C(q*) is finite.
Again by Corollary 5.2 of [6] there exists a λ such that

    e(x, q*(x)) − λq*(x) = sup {e(x, y) − λy : −1 ≤ y ≤ 1}  for a.e. x ∈ R.

Hence,

    e′(x, y) ≥ λ  for −1 < y < q*(x),
    e′(x, y) ≤ λ  for q*(x) < y < 1,

for a.e. x ∈ R. It follows that

    q*(x) = 1 for x² < 1/λ,    q*(x) = −1 for x² > 1/λ.

Hence, either C(q*) = −∞ or C(q*) = +∞, contrary to the assumption that C(q*) is finite. Thus, we cannot omit the condition C(L) > −∞ in Theorem 3.3.

REMARK 3.6: In the search theory case where L(x) = 0, U(x) = ∞ for x ∈ R and e(x, y) = f(x)b(x, y) for (x, y) ∈ Ω, the conditions of Theorem 3.3 will be satisfied if b(x, ·) is right continuous for x ∈ R. Since b(x, ·) is increasing and |E(q)| ≤ 1 for q ∈ Φ, the only condition that is not obviously satisfied is the coverability condition. However, since b(x, ·) is increasing and right continuous, it is upper semicontinuous. Thus, e(x, ·) has a minimal concave majorant m(x, ·) which is continuous, and one may check that e(x, y) = m(x, y) whenever (y, m(x, y)) is an extreme point of m(x, ·). It can be shown that since e is Borel, m is a.e. equal to a Borel function. Thus the conditions for coverability are satisfied. It follows that whenever the local effectiveness function is right continuous and the target location distribution is given by a density function on Euclidean n-space, a uniformly optimal search plan exists. Note that uniformly optimal search plans may be used to produce sequences which are both incrementally and totally optimal.

REFERENCES

[1] Arkin, V. I., "Uniformly Optimal Strategies in Search Problems," Theory of Probability and Its Applications 9, 674-680 (1964).
[2] Dobbie, James M., "Search Theory: A Sequential Approach," Nav. Res. Log. Quart. 10, 323-334 (Dec. 1963).
[3] Everett, Hugh, "Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources," Operations Res. 11, 399-417 (1963).
[4] Koopman, B. O., "The Theory of Search: III. The Optimum Distribution of Searching Effort," Operations Res.
5, 613-626 (1957).
[5] Wagner, D. H., "Nonlinear Functional Versions of the Neyman-Pearson Lemma," SIAM Rev. 11, 52-65 (June 1969).
[6] Wagner, D. H. and L. D. Stone, "Necessity and Existence Results on Constrained Optimization of Separable Functionals by a Multiplier Rule," SIAM J. Control 12 (1974), to appear.
[7] Wagner, D. H. and L. D. Stone, "Optimization of Allocations Under a Coverability Condition," SIAM J. Control 12 (1974), to appear.
[8] Zahl, S., "An Allocation Problem With Applications to Operations Research and Statistics," Operations Res. 11, 426-441 (1963).

AN APPROACH TO THE ALLOCATION OF COMMON COSTS OF MULTI-MISSION SYSTEMS*

Robert Thomas Crow
School of Management
State University of New York at Buffalo

ABSTRACT

Many Naval systems, as well as other military and civilian systems, generate multiple missions. An outstanding problem in cost analysis is how to allocate the costs of such missions so that their true costs can be determined and resource allocation optimized. This paper presents a simple approach to handling this problem for single systems. The approach is based on the theory of peak-load pricing as developed by Marcel Boiteux. The basic principle is that the long-run marginal cost of a mission must be equal to its "price." The implication of this is that if missions can cover their own marginal costs, they should also be allocated some of the marginal common costs. The proportion of costs to be allocated is shown to be a function not only of the mission-specific marginal costs and the common marginal costs, but also of the "mission price." Thus, it is shown that measures of effectiveness must be developed for rational cost allocation. The measurement of effectiveness has long been an intractable problem, however. Therefore, several possible means of getting around this problem are presented in the development of the concept of relative mission prices.
THE PROBLEM

This paper is an attempt to provide a new method of allocating the common costs of new investments in a multi-mission system to individual missions in a way that is (1) operational, (2) objective, and (3) defensible from the point of view of efficient resource allocation.†

The most important reason for allocating common costs to individual missions is to provide guidance for procurement of systems and to estimate the costs of accomplishing given missions by alternative systems. If common costs are properly allocated, it is possible to estimate the true costs of accomplishing a mission with one system compared to another.

For an illustration of the problem, consider Table 1. Five systems are shown that can be combined to accomplish three missions. Systems A, B, and C are single-mission systems and therefore, by definition, have no common costs. Systems D and E are multi-mission systems characterized by significant common costs, as well as incremental costs which are specific to each mission. How much does it cost to accomplish the missions by the use of multi-mission systems? Which systems (single-mission, multi-mission, or some combination) should be procured? Obviously, military systems problems are too complex to be accurately characterized by such simple questions. Yet it appears that in day-to-

*The work on which this article is based was performed for the Chief of Naval Operations, Systems Analysis, as part of Contract No. N00014-70-C-0086 with Mathematica, Inc. This paper is a revision of portions of a report [4] prepared for that contract.

†Common costs are defined as those which are incurred by a single system regardless of which of a number of missions is being performed. They may arise from either operation or investment. It should be clear that, since the subject is investment decisions, it is the incremental common costs that are to be allocated.
day operations, they are often posed in this fashion, for first-approximation purposes if not for final decisions.

TABLE 1. Comparison of Costs of Single- and Multi-Mission Systems

    Costs            A     B     C     D     E
    Common costs     --    --    --    50    60
    Mission 1        65    --    --    20    --
    Mission 2        --    45    --    30    30
    Mission 3        --    --    60    --    10
    Totals           65    45    60   100   100

As an example of the consequences of misallocating common costs: if in a given system all common costs are allocated to a particular important mission, it may appear that the system's capability in that mission is costly relative to other systems, and it may not be purchased. Even if it is purchased, it may be underutilized. On the other hand, other mission capabilities of the system may appear to be less costly than they are in fact, leading to overpurchase or overutilization.

In general, the allocation of common costs has been avoided by economists. Most work on common costs has focused on short-run marginal cost analysis and hence only on the costs that are variable for each specific output (or mission, in our context).* This focus on short-run problems sidesteps the issue of how to handle common costs in investment decisions, where there are normally several alternative courses of action and no costs are fixed. For pricing and other types of decisions, in both the private and public sectors, if other than short-run marginal or incremental prices are considered, reliance has usually been placed on arbitrary allocations of common costs to some or all outputs. This was the general rule until 1949, when Marcel Boiteux wrote an ingenious and basic article on peak-load pricing for electrical utilities, translated into English in 1964 (Ref. [2]).
In the approach presented here, two principles will be emphasized: (1) (following Boiteux) efficient allocation of common costs can and must be based on "marginal" conditions; and (2) efficient allocation must be based explicitly on considerations of how well a given system performs a given mission, as well as an assessment of the need for capability in that mission.

SOME BASIC PRINCIPLES

A basic theorem of resource allocation is that in a competitive economy the maximization of output in equilibrium occurs when price (P) equals the cost of producing one additional unit of output (marginal cost, or MC). The P = MC output can be proven to be optimal if competitive conditions hold throughout the economy (assuming the existence of U-shaped average cost curves). This is a fundamental principle in the ensuing discussion on cost allocation.

*For an example, see Ref. [3, ch. 5]. Professor William Baumol has pointed out in correspondence that there have been a number of recent exceptions to this general rule for civilian systems, such as utilities. In particular, see Ref. [6].

Furthermore, it is necessary to recognize that before any costs are sunk, they are all variable. Thus, in planning for investment, the criterion that price must equal marginal cost means that the marginal cost measure must include investment cost. Assuming for the moment that a set of "prices" can be established for the value of accomplishing a military mission, the widely used criterion of price equal to short-run marginal cost (marginal operating cost) is only valid as the minimum price at which a particular system should be used in a given mission. In other words, for a given system, if the mission is not needed sufficiently to be worth sacrificing enough resources to pay for its marginal operating costs, then the mission should not be performed.
An example of the possible consequences of using only short-run marginal costs in system decisions is shown in Figures 1 and 2. The amount of output of each of the two systems is q̄, the vertical distances Of_1 and Of_2 represent investment costs, and the vertical distances of the shaded areas represent marginal costs. In this case, it is clear that short-run marginal costs are lower for system 2 than for system 1; however, in the long run, the total cost of system 1 is less because its investment costs are lower. Therefore, it should be chosen in spite of its higher short-run marginal costs.

Figure 1. Output with high marginal operating costs and low marginal investment costs, System 1.

Figure 2. Output with low marginal operating costs and high marginal investment costs, System 2.

In order to include investment costs in the marginal cost measures, it is necessary to distinguish long-run from short-run costs. Long-run costs represent situations where output is variable through investment, rather than by changing utilization of existing capacity (which can occur over the short run). The question is how to reconcile the short-run condition of price equal to marginal cost with the necessary condition of long-run marginal costs being covered. The solution is deceptively simple: for a given objective to be achieved, purchase that number of units of the system at which the price of each unit equals both the short-run and the long-run marginal cost. It is necessary to demonstrate this assertion.*

Real systems often are relatively inflexible in their capability. That is, after a point, significant increases in operating costs will expand output very little, i.e., the cost curves have an "elbow" where they become vertical when capacity is reached.
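The point of Figures 1 and 2 can be restated with a toy calculation (all numbers assumed for illustration only): at the same output, the system with the lower short-run operating cost can still be the more expensive system once investment cost is included.

```python
# Toy comparison behind Figures 1 and 2; all figures are assumed.
q = 100.0                    # common output level for both systems

w1, inv1 = 0.60, 20.0        # system 1: high operating cost, low investment
w2, inv2 = 0.40, 50.0        # system 2: low operating cost, high investment

short_run_1, short_run_2 = w1 * q, w2 * q          # operating costs only
total_1, total_2 = inv1 + short_run_1, inv2 + short_run_2

print(short_run_2 < short_run_1)   # system 2 looks cheaper short run
print(total_1 < total_2)           # but system 1 is cheaper in total
```

A short-run comparison alone would pick system 2; including investment reverses the choice, which is why the marginal cost measure used for procurement must include investment cost.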
It can be proven that for a family of such short-run curves, representing either expanded units or additional units, the envelope of the short-run curves is the long-run total-cost curve. This is illustrated in Figure 3.

Figure 3. Relationship of long-run to short-run total costs.

This is the case that will be dealt with here in demonstrating that optimum resource allocation calls for equality of short- and long-run marginal costs.† In the ensuing discussion it will be useful to identify short-run costs with operating costs, and long-run costs with operating plus investment costs. In dealing with a system with very little flexibility, we are able to illustrate a means of approximating the solution for the case in which the system is completely inflexible.

The following notation will be used:

    C: total cost variable over the short run (operating costs only)
    C̄: total cost variable over the long run (operating plus investment costs)
    C_f: total investment cost (fixed in the short run)

*A rigorous treatment is presented in an appendix to Ref. [2].

†This can be relaxed, but at the sacrifice of simplicity in exposition and application. See appendix to Boiteux [2].

    q: output of a system
    q̄: capacity which meets some given requirement
    z: marginal operating cost (variable in the short run) = dC/dq
    x: marginal investment cost (variable in the long run) = dC_f/dq̄
    y: marginal long-run cost = dC̄/dq̄
    w: an approximation of z for q < q̄, defined by (C − C_f)/q, i.e., the average avoidable cost.

For q < q̄, let the total cost function be*

(1)    C = C_f + wq.

Over the long run, adjustment can be made to the required capacity (i.e., investment can be varied). Therefore (1) can be expressed as

(2)    C̄ = C_f(q̄) + wq̄,

and, assuming that w does not vary with the number of units,

(3)    dC̄/dq̄ = dC_f/dq̄ (q̄) + w = y.
That is, the long-run marginal cost of a number of units is

(4)    y = x + w,

by the definitions of y, x, and w; i.e., the long-run marginal cost is equal to the sum of the marginal investment cost and the average avoidable cost.

If there is true inflexibility at q̄, short-run marginal cost is indeterminate. However, if there is a slight bit of flexibility, z becomes very large as q → q̄. Therefore, it becomes equal to y at some point on its vertical arm. At this point, long-run and short-run marginal costs are equal, which establishes the solution of price equal to long-run and short-run marginal costs,

(5)    p = y = x + w = z.

This is illustrated in Figure 4.

Figure 4. Equality of short-run and long-run marginal costs at capacity output.

The Basic Principle of Common Cost Allocation

Assume that a given system can perform two missions, and its investment cost is common to both. That is, there are no characteristics of the system that can be attributed uniquely to one mission or another.*

*This section follows Boiteux [2] closely. Any errors are likely to be mine, not his.

†Linearity is assumed for convenience and because in applied work one often has only linear approximations in any case. The linearity assumption does not appear to have a material effect on the analysis.

In the case of two missions being performed by a single system, the cost functions for each of them, with flexible capacity, are†

(6a)    C̄_1 = C_f(q̄) + w_1 q_1

(6b)    C̄_2 = C_f(q̄) + w_2 q_2,

where q_1 and q_2 represent the output of missions 1 and 2, respectively. Given a particular output requirement, say q̄, the requirement for optimum resource allocation is that the long-run cost for both missions together be minimized, i.e.,

(7)    d(C̄_1 + C̄_2)/dq̄ = 0.
To establish the equality of long- and short-run marginal costs, consider the differential of C_1 + C_2:

(8)  d(C_1 + C_2) = (∂C_1/∂q_1)dq_1 + (∂C_2/∂q_2)dq_2 + 2(∂C_f/∂q_0)dq_0.

The first two terms on the right-hand side are the short-run marginal costs, z_1 and z_2, times their respective variations in output. The third term must be equal to zero for cost minimization to hold. The differential d(C_1 + C_2) may also be written as

(9)  dC_1 + dC_2 = (dC_f/dq_0)dq_0 + w_1 dq_1 + (dC_f/dq_0)dq_0 + w_2 dq_2,

where dq_1 = dq_2 = dq_0.* Dividing (9) by dq_0 yields

(10)  (dC_1 + dC_2)/dq_0 = 2 dC_f/dq_0 + w_1 + w_2.

Since dC_f/dq_0 has been defined above as x, the optimum condition of equality of short- and long-run marginal costs is

(11)  z_1 + z_2 = 2x + w_1 + w_2.

For the prices of the two missions to be equal to the long-run marginal costs of a given system capacity,

(12)  p_1 + p_2 = 2x + w_1 + w_2.

The allocation of common investment costs follows immediately from (12), which implies

(13)  (p_1 − w_1)/2x + (p_2 − w_2)/2x = 1.

That is, the share of common costs to be allocated to each mission performed by the system is equal to the corresponding term in (13).

Before turning to the problem of how the prices of missions are to be determined, it is instructive to consider two particular cases. One is the case where one of the missions (mission 2, for example) has a price such that it is just worth its marginal operating costs, p_2 = w_2. In this case all common costs are allocated to mission 1 by (13). This is illustrated in Figure 5.

[Figure 5. One system bearing all common costs and the other system bearing only operating costs.]

*Boiteux [2, appendix] has shown that this is likely to be a reasonably good approximation over a wide range of conditions.

It is important to note that this is a generalization of the "two-ship" method of allocating investment costs currently in use (Grey [5], pp. 2-3, sec. 2-4).
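The allocation rule in (12) and (13) can be checked with a small numerical sketch. The figures and the helper `common_cost_shares` below are invented for illustration; they are not taken from the paper:

```python
# Hypothetical illustration of (12)-(13): two missions share a common
# investment cost in proportion to (p_i - w_i)/2x.  All figures invented.
def common_cost_shares(p1, p2, w1, w2, x):
    """Fractions of the common investment cost borne by missions 1 and 2."""
    # The optimality condition (12): p1 + p2 = 2x + w1 + w2.
    assert abs((p1 + p2) - (2 * x + w1 + w2)) < 1e-9
    return (p1 - w1) / (2 * x), (p2 - w2) / (2 * x)

s1, s2 = common_cost_shares(p1=90, p2=60, w1=20, w2=30, x=50)

# Boundary case of Figure 5: p2 just covers operating cost (p2 = w2),
# so mission 1 bears all of the common cost.
t1, t2 = common_cost_shares(p1=120, p2=30, w1=20, w2=30, x=50)
```

By construction the two shares sum to one, which is exactly the content of (13).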
In this method, "major" missions are allocated the entire common investment cost of the system plus their incremental investment costs, while "minor" missions are allocated only their incremental investment costs.

In the case shown in Figure 6, the problem is different in that each mission's price is such as to be able to cover at least a portion of its common marginal investment costs. In this case, the question arises: Is one mission allocated only operating costs and the other allocated both its operating and the common investment costs? And further, if the common investment costs are shared, how are they divided? The answer is that since p_1 and p_2 both exceed w (assumed to be the same for both missions to keep the diagram simple), common costs must be allocated according to (13), where both terms will be greater than zero. Thus each mission will bear a share of the common cost.

[Figure 6. Both systems bearing common costs.]

THE PROPOSED APPROACH TO PRICE DETERMINATION AND COST ALLOCATION

Military cost-effectiveness analysis generally takes one or another of the following forms: (a) maximize effectiveness subject to a cost constraint, or (b) minimize cost subject to achieving a given level of effectiveness. In this paper, attention will be devoted to the latter, but the basic approach is also applicable to the former.

As seen from the discussion above, it is critical to be able to set the prices of various missions. At first, this requirement appears unorthodox and difficult; but, in fact, it is not far from existing practice. In the case of a single system, the price of one unit of the system (say one plane), plus the wages of personnel, the expenditure for fuel, and whatever else is necessary for the unit to perform its mission, can be considered to be the price of the mission.
Expressed somewhat differently, the amount of expenditure necessary to meet a given mission requirement, divided by the units of a system which must be used, is the price of that mission as performed by a particular system.

The Alternative System Method

Consider two single-mission systems which perform different missions. The expenditures on these two systems are shown in Figures 7 and 8. In order to meet effectiveness requirements for each, q_1 and q_2 units are required, respectively. To meet these requirements, expenditures of Op_1C_1q_1 and Op_2C_2q_2 are necessary, implying mission prices of p_1 and p_2. If a third system performed both missions jointly, it might be procured in such quantity (q_3) that it met the requirements of mission 1 and part of the requirements of mission 2. This is shown in Figure 9. If we assume that the operating costs of system 3 are identical to those of systems 1 and 2, and the investment costs are exactly the same for system 3 as for the sum of systems 1 and 2, then, at the optimum, the prices of the missions for the multi-mission system are equal to those of the single-mission systems:

p_1 + p_2 = 2(x_3 + w).

This follows from (12), and allocation of common costs (investment costs in this case) follows (13). That is, in this particular case the shares of common costs differ solely because of the difference in mission prices.

[Figure 7. Average price and required units of system in Mission 1 (alternative system method).]

[Figure 8. Price and required units in Mission 2 (alternative system method).]

[Figure 9. Costs and prices of both missions with a single system.]

Clearly the assumption of identical costs is unrealistic. There is no reason to assume that a multi-mission system will cost precisely the same as the sum of two single-mission systems. In fact, if it did there would be no apparent reason to buy it.
The only apparent reason for a multi-mission system is that it meets a set of requirements less expensively than single-mission alternatives. Thus, to preserve the equality of prices and marginal costs for the multi-mission system, it is necessary to scale either the marginal costs or the prices. Since it makes no difference which is chosen, prices are scaled:

p_13 + p_23 = n(p_1 + p_2) = 2(x_3 + w),

where p_13 and p_23 are the prices of missions 1 and 2 as performed by system 3, and

(14)  n = 2(x_3 + w)/[(x_1 + w) + (x_2 + w)],

from the conditions p_1 = x_1 + w and p_2 = x_2 + w established in (5). Thus, the proportions of common investment cost (for example) allocated to different missions are the respective terms of

(np_1 − w)/2x_3 + (np_2 − w)/2x_3 = 1.

Next, consider systems where certain investment and operating costs can be attributed to specific missions. The following notation will be used:

p_13 = price of mission 1 performed by system 3
p_23 = price of mission 2 performed by system 3
w_13 = an approximation of marginal operating costs for mission 1 performed by system 3 (see the definition of w prior to Equation (1))
w_23 = an approximation of marginal operating costs of mission 2 performed by system 3
w_3 = an approximation of marginal operating costs of system 3 common to missions 1 and 2
z_13 = marginal operating cost of mission 1 performed by system 3
z_23 = marginal operating cost of mission 2 performed by system 3
x_13 = marginal investment cost of mission 1 performed by system 3
x_23 = marginal investment cost of mission 2 performed by system 3
x_3 = marginal investment cost of system 3 common to missions 1 and 2

The critical condition for optimum resource allocation in the simple case presented in (11) was z_1 + z_2 = 2x + w_1 + w_2.
If there are investment costs and operating costs common to both missions, as well as investment costs and operating costs specific to each, the specific costs are directly attributable to their respective missions. The condition can be written as

z_13 + z_23 = 2(x_3 + w_3) + x_13 + w_13 + x_23 + w_23,

which preserves the equality of short- and long-run costs for the system. Since the condition for optimum resource allocation is that output is such that price equals marginal cost, p_13 = z_13 and p_23 = z_23. Therefore,

(15)  p_13 + p_23 = 2(x_3 + w_3) + x_13 + w_13 + x_23 + w_23

and

(p_13 − x_13 − w_13) + (p_23 − x_23 − w_23) = 2(x_3 + w_3),

which implies that allocation of common investment and operating costs follows

(16)  [(p_13 − x_13 − w_13)/2(x_3 + w_3)] + [(p_23 − x_23 − w_23)/2(x_3 + w_3)] = 1.

Of course, in the case where p_1 and p_2 are given from single-mission systems, then p_13 = np_1 and p_23 = np_2 from (14).

Knowledge of Historical or Simulated Trade-offs

If historical data, e.g., from Viet Nam, Korea, or World War II, are relevant to the missions in question, perhaps some trade-offs can be established. Force structure analysis might also be useful in establishing such trade-offs. For example, suppose that as a result of such analysis a trade-off could be established such that the outcome of a campaign would have been the same regardless of whether an amount of "output" m_1 of mission 1 or m_2 of mission 2 were provided. Therefore the price of mission 2 relative to mission 1 is

p_2 = (m_1/m_2)p_1, or p_2 = kp_1.

The price of mission 1 performed by system 3 must be such that (15) will hold, that is,

p_13 + kp_13 = 2(x_3 + w_3) + x_13 + w_13 + x_23 + w_23,

which implies

(17)  p_13 = [2(x_3 + w_3) + x_13 + w_13 + x_23 + w_23]/(1 + k).

Since p_23 = kp_13, both p_13 and p_23 are determined, and allocation of common costs follows (16).

Expert Opinion

Expert opinion may be used to establish the trade-offs needed for common cost allocation.
This appears to be the implicit assumption underlying the two-ship method, in which a judgment must be made as to which mission is "major" and which is "minor." The principal distinction is that under the proposal of this paper, if two missions are believed to be approximately equal in importance, common costs would be allocated according to relative prices, taking incremental investment and operating costs into account, rather than allocating all or none of the common costs to each system. If one mission was thought to be slightly more important for the system in question than the other, relative prices might be set such that p_1/p_2 = 55/45, and so forth. The setting of an absolute price for one mission would follow (17), and the allocation of common costs would follow (16).

A Simple Example of Mission and System Comparisons

At the beginning of the paper, Table 1 presented the costs of achieving given amounts of output in three missions by five different systems: three single-mission and two multi-mission. To illustrate the method, consider the basic problems of the paper: (1) which systems are least expensive in carrying out their multiple missions, and (2) what are the costs of multiple missions supplied by a single system?

Since we have provided a single-mission system for each of our alternatives, we use the alternative system method of determining mission price. The information of Table 1 and the results of allocation are presented in Table 2. In the illustration, it will be assumed that the entries in the table are costs per unit and that marginal investment and operating costs are constant. Thus, the entries are approximations of marginal costs, i.e., the w's and x's in the notation above.
The functions, such as (6a) and (6b), which would yield these parameters could be developed from statistical cost estimating, industrial engineering studies, analogy, or even expert opinion, and should reflect all of the sources of costs (e.g., equipment, fuel, personnel) that make the system operational.

TABLE 2. Marginal Costs of Missions 1, 2, and 3 for All Systems Before and After Allocation of Common Costs

            |     Before allocation    | After allocation
Costs       |  A    B    C    D    E   |   D     E
Common      |  ·    ·    ·   50   60   |   ·     ·
Mission 1   | 65    ·    ·   20    ·   |  55     ·
Mission 2   |  ·   45    ·   30   30   |  45    49
Mission 3   |  ·    ·   60    ·   10   |   ·    51

Beginning with System D, allocation follows from (14) and (16). That is,

n_D = (2x_D + w_D1 + w_D2)/[(x + w)_A + (x + w)_B] = (100 + 20 + 30)/(65 + 45) = 1.36,

and

n_D(p_A + p_B) = 2x_D + w_D1 + w_D2, or 1.36(65 + 45) = 100 + 20 + 30,

which implies

(n_D p_A − w_D1)/2x_D + (n_D p_B − w_D2)/2x_D = 1, or 0.69 + 0.31 = 1.00.

That is, 69 percent of the common cost is borne by Mission 1 and 31 percent by Mission 2. The long-run marginal costs, with the common costs allocated to the specific missions, are therefore

Y_D1 = 0.69(50) + 20 ≈ 55 and Y_D2 = 0.31(50) + 30 ≈ 45.

The same procedure is followed for System E, where

n_E = (2x_E + w_E2 + w_E3)/[(x + w)_B + (x + w)_C] = (2(60) + 30 + 10)/(45 + 60) = 1.52

and

(n_E p_B − w_E2)/2x_E + (n_E p_C − w_E3)/2x_E = 1, or 0.32 + 0.68 = 1.

The shares of common cost allocated to Missions 2 and 3, 32 and 68 percent, are then applied to the marginal common costs, and the sums of the allocated marginal common cost plus the incremental cost for each mission are

Y_E2 = 0.32(60) + 30 ≈ 49 and Y_E3 = 0.68(60) + 10 ≈ 51.

Now that the costs have been allocated, let us consider the results. Taking System D first, we see that its cost for performing Mission 1 is less than that of the single-mission system, System A. Its cost of performing Mission 2 is exactly the same as that of System B.
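The System D and System E computations above can be reproduced in a few lines. This is a sketch (the function name is ours) using the Table 2 figures, with the shares kept unrounded:

```python
# Sketch reproducing the System D and System E allocations via (14) and the
# share identity; the figures are those of Table 2, the helper name is ours.
def scale_and_allocate(x_common, w_inc, single_prices):
    """Scaling factor n of (14) and the two common-cost shares."""
    total = 2 * x_common + sum(w_inc)        # system long-run marginal cost
    n = total / sum(single_prices)           # n of (14)
    shares = [(n * p - w) / (2 * x_common) for p, w in zip(single_prices, w_inc)]
    return n, shares

# System D: common cost 50, incremental costs 20 and 30, prices p_A=65, p_B=45.
nD, (sD1, sD2) = scale_and_allocate(50, [20, 30], [65, 45])
yD1 = sD1 * 50 + 20     # long-run marginal cost of Mission 1 on D
yD2 = sD2 * 50 + 30     # long-run marginal cost of Mission 2 on D

# System E: common cost 60, incremental costs 30 and 10, prices p_B=45, p_C=60.
nE, (sE2, sE3) = scale_and_allocate(60, [30, 10], [45, 60])
yE2 = sE2 * 60 + 30
yE3 = sE3 * 60 + 10
```

The unrounded shares 0.686/0.314 and 0.321/0.679 round to the 0.69/0.31 and 0.32/0.68 reported above; the slight differences between the resulting Y values and the printed 55/45 and 49/51 are rounding only.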
Therefore, it would appear that System D is worthy of consideration for procurement and deployment for Missions 1 and 2, since its cost in each mission is, at worst, no greater than that of competing systems.

Turning to System E, we find that it is more expensive in Mission 2 than are Systems B and D, i.e., its incremental cost exceeds its price. On the other hand, it is less expensive than C in its performance of Mission 3. Since its long-run marginal cost for Mission 2 exceeds that of B and D, System E will never be used in that mission. This implies that all common costs are to be allocated to the missions which it does perform, Mission 3 in this case. However, this means that the long-run marginal cost of Mission 3 is now 70 instead of 51, and it too now exceeds the cost of the single-mission alternative. Thus, it will never be used according to our simplified analysis and does not appear to be a good candidate.

This result has a paradoxical element in that even though E's marginal costs exceed those of the single-mission alternatives when considered separately, the sum of its marginal costs is lower than the sum of the single-mission alternatives. This, of course, is due to all costs being allocated to Mission 3, since Mission 2 is not performed. Does the allocation of common costs thus lead us astray? Would we not be more likely to make good decisions on system use and procurement if we simply compared total long-run marginal costs and ignored allocation? The answer, in this particular example at any rate (and probably generally, too), is no. Consider the possible ways of accomplishing Missions 1, 2, and 3. The combinations and their associated costs are presented in Table 3.

TABLE 3. Marginal Costs of Alternative Combinations of Systems Providing Three Missions

Combination |  A    B    C    D    E   | Total
A, B, C     | 65   45   60    ·    ·   |  170
A, E        | 65    ·    ·    ·  100   |  165
C, D        |  ·    ·   60  100    ·   |  160
D, E        |  ·    ·    ·  100  100   |  200

The result of considering alternative combinations is that the least-cost combination is C and D. In the case of a direct comparison with Systems B and C alone, however, System E does have an advantage. In that case, what is implied is that cost penalties be accepted in Mission 2 in order to retain low-cost capabilities in Mission 3. In this case, the marginal cost of Mission 2 would be set exactly equal to the price of the alternative system, reducing it from 49 to 45, and the marginal cost of Mission 3 would be raised from 51 to 55; i.e., four units of marginal common cost would be reallocated from Mission 2 to Mission 3.

Common Cost Allocation, Time, and Discount Rates

Investments in military systems, like other public and private systems, have particular useful lives and are subject to particular rates of discount. Exactly how useful lives and discount rates are determined is a difficult problem in its own right and will not be discussed here. Suffice it to say that useful lives are functions of wear and tear and of obsolescence, and discount rates reflect the terms under which the values of present and future costs and effectiveness are compared.

Several measures have been employed for evaluating benefits and costs over time. Although each has its strengths and weaknesses, the best general measure appears to be the net discounted present value of an investment.* This is the measure to be employed for the purpose of illustrating how discounting and system life may be handled for multiple-mission systems. First, let us consider a single-mission system.
Its net discounted present value (V) is

V = Σ_i d_i (p q_0 − C̄),

where d_i is the discount factor for year i, i.e., (1 + r)^(−i), where r is the rate of interest in use for military systems and is assumed, for the sake of simplicity, to be constant over all relevant years.† The other variables are as described above. In words, then, V is the sum of a stream of net benefits, each year's entry being discounted by a greater amount than that of the year before.**

If we extend this to the case of a multiple-mission system whose investment costs are common to both missions, where the missions have identical useful lives and are subject to the same interest rate, we have

(18)  V = Σ_i d_i [p_1 q_0 + p_2 q_0 − C_f(q_0) − w_1 q_0 − w_2 q_0].

The first-order condition for the maximization of the net discounted present value of the two missions performed by the system is ∂V/∂q_0 = 0. This implies, following Boiteux [2, appendix],

(19)  Σ_i d_i (p_1 + p_2) = Σ_i d_i (2x + w_1 + w_2).

Since both sides may be divided by Σ_i d_i, we see that (19) reduces to

p_1 + p_2 = 2x + w_1 + w_2.

*For a comparison of the more prominent measures, see Baumol [1, ch. 19].
†The interest rate is presumably based on some notion of social time preference or opportunity cost. For a discussion of some of the issues involved, see Prest and Turvey [7, pp. 697-700].
**The link from "benefits" to expenditure (pq_0) in our context is that pq_0 is the expenditure necessary to meet a particular requirement, and it is the meeting of the requirement that is the benefit.

Thus, allocation in this case follows the same lines as (13), (16), etc., and the interest rate and useful life play no role. If there are different useful lives of the system in different missions, and/or the interest rates differ, a general solution of the allocation problem has not been found, although solutions have been found for specific cost functions, e.g., linear functions.
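The cancellation in (19) is easy to check numerically. With a common useful life and a common interest rate, the discounted condition reduces to the timeless one; all figures below are invented:

```python
# Numerical check that the discount factors cancel in (19): with a common
# useful life and interest rate, the allocation condition reduces to the
# undiscounted one, p1 + p2 = 2x + w1 + w2.  All figures are invented.
r, years = 0.07, 10
d = [(1 + r) ** -i for i in range(1, years + 1)]   # discount factors d_i

p1, p2 = 90.0, 60.0
x, w1, w2 = 50.0, 20.0, 30.0

lhs = sum(d) * (p1 + p2)               # sum_i d_i (p1 + p2)
rhs = sum(d) * (2 * x + w1 + w2)       # sum_i d_i (2x + w1 + w2)
# Dividing both sides by sum(d) recovers the undiscounted condition:
undiscounted = (p1 + p2) - (2 * x + w1 + w2)
```

Changing `r` or `years` leaves the equality untouched, which is why the allocation shares are independent of the discount rate in this case.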
The problem arises from the translation of units of output of particular missions into units of the multiple-mission system. The simplicity of the cost-allocation scheme proposed in this paper is due to Boiteux's demonstration that this can be done with great generality when time does not enter the picture. This breaks down, however, where the missions have different useful lives while performed by the multiple-mission system in question, or have different interest rates, since it is then not generally possible to translate them from specific missions to units of the multiple-mission system. It appears that the basic approach retains its validity, but not its simplicity.

CONCLUSION

This paper has presented a new technique for allocating the common costs of multiple-mission systems. One major departure from existing practice is that the basis for allocation is to be found in the importance of the missions as reflected in their relative prices or, more generally, in an assessment of the relative abilities of a system to carry out alternative missions. The second major departure is that it uses marginal conditions rather than proportional or "either-or" allocations. Thus, unlike existing techniques, it is consistent with the principles of efficient resource allocation.

These are not such drastic departures from current practice as they may seem, since some notion of relative importance is implicit in the distinction of major from minor missions in the currently employed two-ship method of allocation. The two-ship method also adheres to a rough approximation of marginal principles. In a very real sense, what has been presented above can be regarded as a generalization of the two-ship method, as well as an explicit statement of the principles underlying it.

The major problem in employing the proposed method is, of course, to develop means of measuring the relative prices of different missions supplied by a system.
Several tentative suggestions have been offered which, crude as they are, should aid in achieving a better allocation of common costs. In all likelihood, better means can and will be devised through experience if the proposed approach is employed.

ACKNOWLEDGMENT

The author is grateful for discussions and correspondence with A. S. Rhode, J. T. Kammerer, K. F. Linder, Saul Gass, George Taylor, Kenneth Babka, and William Baumol. I wish to thank them but also absolve them of any blame for whatever errors or misconceptions remain.

REFERENCES

[1] Baumol, W. J., Economic Theory and Operations Analysis, 2nd ed. (Prentice-Hall, Englewood Cliffs, N.J., 1965).
[2] Boiteux, M., "Peak Load Pricing," in J. R. Nelson (ed.), Marginal Cost Pricing in Practice (Prentice-Hall, Englewood Cliffs, N.J., 1964), chap. 4, pp. 59-90.
[3] Carlson, S., A Study on the Pure Theory of Production (Kelley and Millman, New York, 1956).
[4] Crow, R. T., "The Allocation of Common Costs of Multiple-Mission Systems," a report to Systems Analysis, Chief of Naval Operations, Contract No. N00014-70-C-0086, MATHEMATICA, Inc., Bethesda, Md. (Nov. 1971).
[5] Grey, J. C., Cost Analysis Methodology (Fire Support Study Working Paper No. 9), U.S. Naval Weapons Laboratory, Dahlgren, Va. (July 1970).
[6] Littlechild, S. C., "Marginal Cost Pricing with Joint Costs," Economic Journal LXXX, 323-335 (June 1970).
[7] Prest, A. R., and R. Turvey, "Cost-Benefit Analysis: A Survey," Economic Journal LXXV, 683-735 (Dec. 1965).

AN EXPLICIT GENERAL SOLUTION IN LINEAR FRACTIONAL PROGRAMMING*

A. Charnes
Center for Cybernetic Studies
University of Texas

W. W. Cooper
School of Urban and Public Affairs
Carnegie-Mellon University

ABSTRACT

A complete analysis and explicit solution is presented for the problem of linear fractional programming with interval programming constraints whose matrix is of full row rank.
The analysis proceeds by simple transformation to canonical form, exploitation of the Farkas-Minkowski lemma, and the duality relationships which emerge from the Charnes-Cooper linear programming equivalent for general linear fractional programming. The formulations, as well as the proofs and the transformations provided by our general linear fractional programming theory, are here employed to provide a substantial simplification for this class of cases. The augmentation developing the explicit solution is presented, for clarity, in an algorithmic format.

I. INTRODUCTION

The linear fractional programming problem arises in many contexts with relatively simple constraint sets, e.g., in the reduction of integer programs to knapsack problems, in attrition games, and in Markovian replacement problems, as well as in Neyman-Pearson rejection region selection problems. Illustrative examples are provided by G. Bradley [5], F. Glover and R. E. Woolsey [12],† J. Isbell and W. Marlow [13], C. Derman [10], and M. Klein [16].

The linear fractional programming problem in all generality, and with all singular cases considered, was reduced in [8] to at most a pair of ordinary linear programming problems. This immediately made available all of the algorithms, interpretations, etc., that are associated with linear programming. This includes, we should note, access to any ordered field,** and any of the algorithms and computer codes for linear programming problems which, by virtue of [8], thereby also become available for any problem in linear fractional form. Thus, with the development in [8], work in linear fractional programming took a different form from its previous sole concern with the development of special types of algorithms for dealing with this kind of problem.
*This research was partly supported by a grant from the Farah Foundation and by ONR Contracts N00014-67-A-0126-0008 and N00014-67-A-0126-0009 with the Center for Cybernetic Studies, The University of Texas. This report was also prepared as part of the activities of the Management Sciences Research Group at Carnegie-Mellon University under Contract N00014-67-A-0314-0007 NR 047-048 with the U.S. Office of Naval Research. Reproduction in whole or in part is permitted for any purpose of the U.S. Government.
†See also E. Balas and M. Padberg [1].
**See, e.g., the development of the opposite sign theorem and related developments in [7].

In the present paper, we apply our reduction, as given in [8], to a general class of linear fractional problems, viz., those for which the constraint set is given by

(1.1)  a ≤ Ax ≤ b,

so that this part of the model is in "interval programming" form.* Here we shall assume that the matrix A is of full row rank and that the vectors a, b, and x meet the usual conditions for conformance. This means that the constraint set is a parallelepiped. See the Final Appendix in [7]. Subject to conditions (1.1), we wish to

(1.2)  maximize R(x) = N(x)/D(x) = (c^T x + c_0)/(d^T x + d_0) ≢ constant,

so that we are now concerned with a problem of linear fractional programming.

Because A is of full row rank it has a right inverse, A*, and hence we can write

(1.3)  AA* = I.

Now, setting y = Ax, or

x = A*y + Pz,

where

(1.4)  P = I − A*A

and z is arbitrary, we obtain

max R(y, z) = (c^T A*y + c^T Pz + c_0)/(d^T A*y + d^T Pz + d_0)

subject to

(1.5)  a ≤ y ≤ b

in place of (1.1) and (1.2). Because z is arbitrary,** unless

(1.6)  c^T P = d^T P = 0,

we shall obtain max R = ∞.

*See [2]-[4] and [18]-[19].
**Observe that we have ruled out the case in which R is identically constant in (1.2).
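The construction in (1.3)-(1.4) can be sketched for a concrete full-row-rank matrix. This is a minimal pure-Python illustration (the helper names are ours), using the standard right inverse A* = A^T (A A^T)^(-1):

```python
# Sketch of (1.3)-(1.4) for a concrete full-row-rank A.  A* = A^T (A A^T)^{-1}
# is one right inverse of A; P = I - A*A maps into the null space of A, so
# x = A*y + Pz satisfies Ax = y for every choice of z.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]                       # 2x3, full row rank

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(r) for r in zip(*X)]

AAt = matmul(A, transpose(A))               # invertible since A has full row rank
det = AAt[0][0] * AAt[1][1] - AAt[0][1] * AAt[1][0]
AAt_inv = [[AAt[1][1] / det, -AAt[0][1] / det],
           [-AAt[1][0] / det, AAt[0][0] / det]]
Astar = matmul(transpose(A), AAt_inv)       # right inverse: A Astar = I, per (1.3)
AstarA = matmul(Astar, A)
P = [[(1.0 if i == j else 0.0) - AstarA[i][j] for j in range(3)]
     for i in range(3)]                     # P = I - A*A, per (1.4)
AAstar = matmul(A, Astar)                   # should be the 2x2 identity
AP = matmul(A, P)                           # zero matrix: Pz never affects Ax
```

Since AP = 0, the component Pz moves x without changing Ax, which is exactly why the z terms in the objective must vanish (condition (1.6)) for a finite maximum.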
In order to avoid repetitious arguments, however, we defer the proof of this until we have discussed the situation c^T P = d^T P = 0. See Section IV below.

Waiving this consideration, we shall next proceed to solve this problem explicitly and in all generality by means of the following three characterizations: the denominator D(x) is (1) bisignant, (2) unisignant and nonvanishing, or (3) unisignant and vanishing on the constraint set. In (1), i.e., the bisignant case, we shall show that max R(x) = ∞. Furthermore, we shall show how to identify this case at the outset so that it may be discarded from further consideration. This will leave us with only cases (2) and (3) to examine, where we shall proceed to transformations from which a one-pass numerical comparison of coefficients makes explicit the optimal value and solution.

After this has all been done, we shall then return to assumption (1.6) in a way that utilizes the preceding developments. Finally, we shall supply numerical examples to illustrate some of these situations, and then we shall draw some conclusions for further research which return to the remarks at the opening of this section.

II. BISIGNANT DENOMINATORS

Employing assumption (1.6), our problem is

(2.1)  max R(y) = (c^T A*y + c_0)/(d^T A*y + d_0),

subject to (1.5). Note, however, that here, and in the following, we shall slightly abuse notation by continuing to use the symbols R, N, D, as in (1.1) and (1.2), even though we mean the transformed functions, as in (2.1).

Let D̄ and D̲ denote the maximum and minimum, respectively, of D over the constraint set. We note:

LEMMA 1: (a) D is bisignant if and only if D̄ > 0 and D̲ < 0. (b) D is unisignant if and only if either D̄ ≤ 0 or D̲ ≥ 0.
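The bisignance test of Lemma 1 is a one-pass computation: the extremes of the linear function D over the box are reached componentwise at the interval ends, which is how (2.2) expresses them. A sketch with invented data:

```python
# Bisignance test of Lemma 1 for D(y) = sum_j d_j y_j + d0 over the box
# a_j <= y_j <= b_j: the max and min of D are attained componentwise at the
# box ends, as in (2.2).  Data are invented.
def D_extremes(d, d0, a, b):
    Dmax = d0 + sum(dj * (bj if dj > 0 else aj) for dj, aj, bj in zip(d, a, b))
    Dmin = d0 + sum(dj * (aj if dj > 0 else bj) for dj, aj, bj in zip(d, a, b))
    return Dmax, Dmin

def is_bisignant(d, d0, a, b):
    Dmax, Dmin = D_extremes(d, d0, a, b)
    return Dmax > 0 and Dmin < 0            # Lemma 1(a)

Dmax, Dmin = D_extremes([1.0, -2.0], 0.5, [0.0, 0.0], [3.0, 1.0])
```

Here D changes sign on the box (D̄ = 3.5, D̲ = −1.5), so this instance would fall under the bisignant case, where R is unbounded.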
In terms of y, since we can choose each component of y independently (see (1.5)), we can express D̄ and D̲ immediately as

(2.2)  D̄ = Σ_+ d'_j b_j + Σ_− d'_j a_j + d_0 = max D(y),
       D̲ = Σ_+ d'_j a_j + Σ_− d'_j b_j + d_0 = min D(y),

where d'_j = (d^T A*)_j is the jth element of d^T A*, and "+" or "−" indicates that the summation runs only over the positive or the negative d'_j.

Let us first consider the bisignant situation. If we make the transformation of variables

(2.3)  y_j − a_j = k_j ζ_j, d'_j ≥ 0;  b_j − y_j = k_j ζ_j, d'_j < 0,

where the k_j > 0 will be suitably chosen, the constraints transform to

(2.4)  0 ≤ k_j ζ_j ≤ b_j − a_j, or 0 ≤ ζ_j ≤ δ_j = (b_j − a_j)/k_j.

Without loss of generality the δ_j are positive, since otherwise ζ_j = 0 and does not enter into the optimization. By choosing the k_j suitably, we obtain the form

(2.5)  R(ζ) = (Σ_j γ_j ζ_j + γ_0)/(Σ_j ζ_j − 1).

Note that Σ_j δ_j > 1, since otherwise D would not be bisignant. One of the following two cases must now hold:

CASE (i): for some ζ̄, 0 ≤ ζ̄ ≤ δ, such that D(ζ̄) = 0, we have N(ζ̄) ≠ 0; or else

CASE (ii): for every ζ̄, 0 ≤ ζ̄ ≤ δ, such that D(ζ̄) = 0, we have N(ζ̄) = 0.

In Case (i), since N(ζ) is continuous, there is a neighborhood of ζ̄ in which N(ζ) is unisignant. Since

(2.6)  Σ_j ζ̄_j = 1 < Σ_j δ_j,

and 0 ≤ ζ_j ≤ δ_j in the constraint set, we can choose ε_j ≥ 0, Σ_j ε_j > 0, so that 0 ≤ ζ̄ ± ε ≤ δ, sgn N(ζ̄ + ε) = sgn N(ζ̄ − ε), and D(ζ̄ + ε) > 0 > D(ζ̄ − ε). By approaching ζ̄ along the line segment from one of ζ̄ + ε, ζ̄ − ε, we can make R(ζ) → ∞.

In Case (ii), we must have γ_j = 0 for all j such that d'_j = 0. For D(ζ) = 0 involves specifying only the ζ_j for d'_j > 0 and d'_j < 0. If γ_{j0} ≠ 0 for some j_0 with d'_{j0} = 0, then, having made D(ζ) = 0, we could change the value of N(ζ) by changing ζ_{j0}. Thus D(ζ) = 0 would not imply N(ζ) = 0. We therefore drop the "+", "−" notation in considering Case (ii) and rewrite it as

Σ_j γ_j ζ_j + γ_0 = 0 whenever 0 ≤ ζ_j ≤ δ_j and Σ_j ζ_j = 1.
By letting ζ_j = y_j/y_0, y_0 > 0, this becomes: (Σ_j γ_j y_j + γ_0 y_0) = 0 whenever

(2.7)  Σ_j y_j − y_0 = 0,  −y_j + δ_j y_0 ≥ 0,  y_j ≥ 0,  y_0 ≥ 0.

Note that the implication extends to y_0 = 0, since y_0 = 0 implies all y_j = 0.

We now apply the Farkas-Minkowski lemma* to the pair of implications in (2.7), i.e., to (Σ_j γ_j y_j + γ_0 y_0) ≥ 0 and −(Σ_j γ_j y_j + γ_0 y_0) ≥ 0, and obtain

(2.8)  γ_j = μ⁺ − θ_j⁺ + ν_j⁺,  γ_0 = −μ⁺ + Σ_j δ_j θ_j⁺ + ν_0⁺;  θ_j⁺, ν_j⁺, ν_0⁺ ≥ 0,

for the first implication, viz., (Σ_j γ_j y_j + γ_0 y_0) ≥ 0. For the second one, viz., −(Σ_j γ_j y_j + γ_0 y_0) ≥ 0, we obtain

(2.9)  −γ_j = μ⁻ − θ_j⁻ + ν_j⁻,  −γ_0 = −μ⁻ + Σ_j δ_j θ_j⁻ + ν_0⁻;  θ_j⁻, ν_j⁻, ν_0⁻ ≥ 0.

Adding the first expressions in (2.8) and (2.9),

(2.10)  0 = (μ⁺ + μ⁻) − (θ_j⁺ + θ_j⁻) + (ν_j⁺ + ν_j⁻), or θ_j⁺ + θ_j⁻ = (μ⁺ + μ⁻) + (ν_j⁺ + ν_j⁻).

Adding the second pair,

(2.11)  0 = −(μ⁺ + μ⁻) + Σ_j δ_j(θ_j⁺ + θ_j⁻) + ν_0⁺ + ν_0⁻, or μ⁺ + μ⁻ = Σ_j δ_j(θ_j⁺ + θ_j⁻) + ν_0⁺ + ν_0⁻.

Since each term on the right is nonnegative, we have

(2.12)  μ⁺ + μ⁻ ≥ 0.

Next, substituting from (2.10) into (2.11), we get

(2.13)  0 = −(μ⁺ + μ⁻) + Σ_j δ_j(μ⁺ + μ⁻) + Σ_j δ_j(ν_j⁺ + ν_j⁻) + ν_0⁺ + ν_0⁻, or
        0 = (Σ_j δ_j − 1)(μ⁺ + μ⁻) + Σ_j δ_j(ν_j⁺ + ν_j⁻) + ν_0⁺ + ν_0⁻.

Since the right-hand side is a sum of nonnegative terms, each of these must be zero. Moreover,

(2.14)  μ⁺ + μ⁻ = 0, since Σ_j δ_j − 1 > 0;  ν_j⁺ = ν_j⁻ = 0, since δ_j > 0;  ν_0⁺ = ν_0⁻ = 0.

By virtue of (2.14), and going back to (2.10),

(2.15)  θ_j⁺ + θ_j⁻ = 0.

Further, with θ_j⁺, θ_j⁻ ≥ 0, we must have θ_j⁺ = θ_j⁻ = 0 for all j. Therefore γ_j = μ⁺ for all j, and γ_0 = −μ⁺, so that we have

(2.16)  R(ζ) = (Σ_j γ_j ζ_j + γ_0)/(Σ_j ζ_j − 1) = μ⁺(Σ_j ζ_j − 1)/(Σ_j ζ_j − 1) = μ⁺ = constant.

In other words, Case (ii) can occur only in the trivial instance where the numerator is a constant multiple of the denominator. In this case, each coefficient in the numerator is the same multiple of the corresponding coefficient in the denominator, and this would have to be true in the original N(x), D(x) description and hence obvious upon comparing the initial coefficients. Since we have ruled out this very obvious case (see (1.2)), we have max R(x) = ∞ when D(x) is bisignant on the constraint set.

*See Appendix C in [7].

III.
UNISIGNANT DENOMINATORS V The unisignant cases now remain to be considered. If D =£ we multiply both W and D by —1, (thus not altering the value of R) and we are then reduced to "D" 5* 0. With this normalization, we make a transformation of variables as in (2.3), yj-aj = gj£j, dj^O (3.1) bj-yj = gj^j, dj<0, where the gj > will be suitably chosen. The constraint set will now be bj — aj (3.2) 0<6*£8,= Si and, first considering the case where D > 0, the gj can be chosen so that the problem is (3.3) maxR(0= j '' , 0^&*£6> LINEAR FRACTIONAL PROGRAMMING 455 In (3.3) the summation is only over "+" and "— " because, the denominator being positive, optimal values for the £, such that dj — can be specified as £/ = when jj < 0; gj= 8, when jj 2* 0, and these new constant terms are assumed to be already contained in y . By the reduction that we gave in [8], however, the equivalent linear programming problem is max ^yjVj+yo'no, j subject to £ TJJ+ 7)0=1 (3.4) 7,^0, where we can also note that these constraints imply rjo > 0. The dual to (3.4) is mm u subject to (3.5) U + (t)j 3= Jj u-^8ja>j=y j We shall employ this dual in an essential manner to obtain our desired one-pass argument for obtaining an optimum. At each step in the procedure, we shall have a solution to a less restrictive problem than the dual problem and an associated primal feasible solution. Suppose the y's are renumbered so that y\ 3* y% 3* . . . y n ; Then Case (i) yo 3= Ji has the immediately obvious primal solution tj* = 1, 17* = 0, and max /?(£) = yo. In the contrary case, Case ii, yi > • - . 2 s y P > yo 2* y P+1 3= . . . 3= y„, we build up an algorithm based on the dual problem in which we choose u q at the gth step to satisfy u« + o>J=yj, 7=1, • • • , q (3.6) u«-%8.<o«=y . Using the first q equations to obtain w? in terms of « 9 and substituting in the last equation, we obtain 456 A CHARNES AND W. W. 
(3.7)  u^q (1 + Σ_{j=1}^q δ_j) = Σ_{j=1}^q γ_j δ_j + γ_0

and hence

(3.8)  u^q = (γ_0 + Σ_{j=1}^q γ_j δ_j)/(1 + Σ_{j=1}^q δ_j).

Thus, u^q is a convex combination of γ_0, γ_1, . . ., γ_q with proportionality constants 1, δ_1, . . ., δ_q. Note that if u^q ≤ γ_q, then u^q, ω_j^q, j = 1, . . ., q, satisfy the first q constraints plus the "γ_0" constraint of the dual problem; hence satisfy a less restrictive problem than the dual. If we take

(3.9)  η_0^q = 1/(1 + Σ_{j=1}^q δ_j)

and

(3.10)  η_j^q = δ_j/(1 + Σ_{j=1}^q δ_j), j = 1, . . ., q;  η_j^q = 0, j > q,

then η_j^q, j = 0, . . ., n is a feasible solution to the primal problem and Σ_j γ_j η_j^q + γ_0 η_0^q = u^q (by substitution in (3.4) and comparison with (3.8)). Hence, whenever we can get u^q, ω_j^q feasible for the dual problem, we will have a primal feasible solution η^q with the same functional value and thus we will have an optimal pair of dual solutions. This, plus the equivalences maintained via (3.1) and our theory from [8], thus justifies the development that we detail as follows:

To start,

(3.11)  u^1 = (γ_0 + γ_1 δ_1)/(1 + δ_1).

(Note, u^1 > γ_0 since γ_1 > γ_0 and u^1 is a proper convex combination of γ_1, γ_0.) We check: Is u^1 ≥ γ_2?

If yes: we are done: ω_1^1 = ω_1* = γ_1 − u^1, ω_j* = 0, j > 1, and η_0^1 = η_0* = 1/(1 + δ_1), η_1^1 = η_1* = δ_1/(1 + δ_1), η_j^1 = η_j* = 0, j > 1.

If no: then u^1 < γ_2 and

u^2 = (γ_0 + γ_1 δ_1 + γ_2 δ_2)/(1 + δ_1 + δ_2)
    = [(γ_0 + γ_1 δ_1)/(1 + δ_1)] · [(1 + δ_1)/(1 + δ_1 + δ_2)] + [δ_2/(1 + δ_1 + δ_2)] γ_2
    = u^1 [(1 + δ_1)/(1 + δ_1 + δ_2)] + [δ_2/(1 + δ_1 + δ_2)] γ_2 < γ_2,

since u^1 < γ_2. Next: Is u^2 ≥ γ_3? If yes: we are done with the substitutions indicated by (3.6). If no: u^2 < γ_3 and we continue to u^3. This process must stop by u^p at the latest since

(γ_0 + Σ_{j=1}^p γ_j δ_j)/(1 + Σ_{j=1}^p δ_j) > γ_0 ≥ γ_{p+1} ≥ . . . ≥ γ_n.

Thus max R(ξ) = u^s, where s is the least positive integer such that u^s ≥ γ_{s+1}, and

u^s = (γ_0 + Σ_{j=1}^s γ_j δ_j)/(1 + Σ_{j=1}^s δ_j).

This concludes the case D > 0. The remaining case has D = 0.
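Read as an algorithm, the D > 0 case is a one-pass procedure: renumber the γ_j in decreasing order and absorb them one at a time into the weighted average u^q of (3.8) until u^q first reaches γ_{q+1}. The following sketch is ours, not the paper's code; it assumes the normal form (3.3), max (Σγ_jξ_j + γ_0)/(Σξ_j + 1) over 0 ≤ ξ_j ≤ δ_j, and the names are hypothetical. The brute-force check exploits the fact that a linear fractional functional with positive denominator attains its maximum over a box at a vertex.

```python
from itertools import product

def max_ratio(gamma0, gamma, delta):
    """One-pass maximization of (sum_j gamma_j*x_j + gamma0)/(sum_j x_j + 1)
    over the box 0 <= x_j <= delta_j, following the u^q recursion of (3.8)."""
    pairs = sorted(zip(gamma, delta), key=lambda p: -p[0])  # gamma_1 >= gamma_2 >= ...
    if not pairs or gamma0 >= pairs[0][0]:
        return gamma0                       # Case (i): eta_0 = 1 is already optimal
    num, den = gamma0, 1.0                  # running numerator/denominator of u^q
    for q, (g, d) in enumerate(pairs):
        num += g * d
        den += d
        u = num / den                       # u^q, convex combination of gamma_0..gamma_q
        nxt = pairs[q + 1][0] if q + 1 < len(pairs) else gamma0
        if u >= nxt:                        # least s with u^s >= gamma_{s+1}: done
            return u
    return num / den

def brute(gamma0, gamma, delta):
    """Vertex enumeration over the box corners, for checking small instances."""
    return max((sum(g * x for g, x in zip(gamma, xs)) + gamma0) / (sum(xs) + 1)
               for xs in product(*[(0, d) for d in delta]))
```

On the data of the worked example in section VI (γ_0 = 1/2, γ = (3/2, 0, −2), δ = (6, 4, 2)) both routines return 19/14.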
By making a transformation of variables as in (3.1), and choosing the g_j > 0 suitably, we obtain

max R(ξ) = (Σ_j γ_j ξ_j + γ_0)/Σ_j ξ_j,  0 ≤ ξ_j ≤ δ_j.

We may dispose of two situations immediately:

(i) γ_0 > 0: then R(ξ) → ∞ as ξ → 0.

(ii) γ_0 = 0: here the dual problems are

max Σ_j γ_j η_j with Σ_j η_j = 1, η_j − δ_j η_0 ≤ 0, η_j, η_0 ≥ 0;
min u with u + ω_j ≥ γ_j, −Σ_j δ_j ω_j ≥ 0, ω_j ≥ 0.

An optimal solution pair is ω_j* = 0, u* = γ_1 = max_j γ_j, and η_1* = 1, η_j* = 0, j ≥ 2. The maximum of R(ξ) is thus γ_1.

The remaining instance is (iii) γ_0 < 0: γ_1 ≥ . . . ≥ γ_p > 0 ≥ γ_{p+1} ≥ . . . ≥ γ_n. The dual linear programs can now be written

max Σ_j γ_j η_j + γ_0 η_0 with Σ_j η_j = 1, η_j − δ_j η_0 ≤ 0, η_j, η_0 ≥ 0;
min u with u + ω_j ≥ γ_j, −Σ_j δ_j ω_j ≥ γ_0, ω_j ≥ 0.

As before, we define

u^q = (Σ_{j=1}^q γ_j δ_j + γ_0)/Σ_{j=1}^q δ_j.

It may be easily verified that

u^q = u^{q−1} (Σ_{j=1}^{q−1} δ_j / Σ_{j=1}^q δ_j) + γ_q (δ_q / Σ_{j=1}^q δ_j),

or, what is the same thing, u^q is a proper convex combination of u^{q−1} and γ_q. Thus, if u^r < γ_{r+1}, then u^r < u^{r+1} < γ_{r+1} as in our earlier argument for D > 0. The steps of our process are as before: if u^q ≥ γ_{q+1} we are done; otherwise, we test u^{q+1} against γ_{q+2}. At worst we are done with

u^n = (Σ_{j=1}^n γ_j δ_j + γ_0)/Σ_{j=1}^n δ_j.

As before, u^s = max R(ξ), where s is the least positive integer such that u^s ≥ γ_{s+1}.

IV. R(y, z)

Returning to (1.5) we consider the remaining cases in which either

(4)  (a) d^T P = 0, c^T P ≠ 0, or (b) d^T P ≠ 0.

In case (a), since z is arbitrary we can make ±c^T Pz → ∞; hence max R(y, z) → ∞. In case (b), we are in the bisignant denominator situation since we can make d^T Pz → ±∞. The argument of the bisignant section of this paper (with the additional variables z) now shows that max R = ∞ since we have ruled out, a priori, the case in which R = constant.

V. EXAMPLES

Some examples may help to fix and sharpen some of the preceding developments. Thus consider

(5.1)  max R(x) = (3x_1 − x_3 + 4)/(2x_2) with −1 ≤ x_1 + x_3 ≤ 2, 1 ≤ x_2 ≤ 5,

and the variables x_1, x_2, x_3 are otherwise unrestricted.
Here we have,*

(5.2)  A = |1 0 1|,  A# = |1 0|,  P = I − A#A = |0 0 −1|
           |0 1 0|        |0 1|                 |0 0  0|
                          |0 0|                 |0 0  1|

*It may be observed that A# is not unique.

To exhibit the development in full detail we next write

(5.3)  c^T A# y = (3, 0, −1) A# y = 3y_1,  c_0 = 4;
       c^T Pz = (3, 0, −1) Pz = −4z_3;
       d^T A# y = (0, 2, 0) A# y = 2y_2;
       d^T Pz = (0, 2, 0) Pz = 0;  d_0 = 0.

Evidently in this case d^T P = 0, as witness the next to the last expression. On the other hand, c^T P ≠ 0, as witness c^T Pz = −4z_3 in the second expression of (5.3). Hence condition (a) of the preceding section obtains and we have R → ∞ even though

(5.4)  −1 ≤ y_1 ≤ 2, 1 ≤ y_2 ≤ 5.

This occurs because z is arbitrary and can be freely chosen in

(5.5)  max R(y, z) = (c^T A# y + c^T Pz + c_0)/(d^T A# y + d^T Pz + d_0) = (3y_1 − 4z_3 + 4)/(2y_2),

which is the specialization of (1.5) to this case. Of course, the result R → ∞ in (5.1) can be confirmed by direct inspection, since negative values of x_3 may be selected along with increasingly positive values of x_1 as required in order to maintain the first interval programming constraint. This last remark suggests that an adjunction such as

(5.6)  0 ≤ x_3 ≤ 1

will convert (5.1) to a problem with a finite maximum. This yields an A for an interval programming format with the full row rank condition fulfilled as in

(5.7)  A = |1 0 1|,  A# = |1 0 −1|
           |0 1 0|        |0 1  0|
           |0 0 1|        |0 0  1|

On the other hand, A is also of full column rank so that we also have A# = A^{−1} and P = I − A#A = 0. Hence both of the conditions specified in (1.6) are fulfilled, viz., c^T P = d^T P = 0 for any c^T and d^T. The problem to be solved is now written

(5.8)  max R(y) = (c^T A# y + c_0)/(d^T A# y + d_0) = (3y_1 − 4y_3 + 4)/(2y_2)
       with −1 ≤ y_1 ≤ 2, 1 ≤ y_2 ≤ 5, 0 ≤ y_3 ≤ 1.

Evidently the solution to this problem is y_1* = 2, y_2* = 1, and y_3* = 0 so that max R(y) = 10/2 = 5.
To obtain the corresponding components of x we simply utilize (1.4) with P = 0 to obtain

(5.9)  (x_1, x_2, x_3)^T = A# y = |1 0 −1| (2)   (2)
                                  |0 1  0| (1) = (1)
                                  |0 0  1| (0)   (0)

As may be seen, these x values satisfy (5.1) with (5.6) adjoined. They are evidently also maximal with R(x) = 5 since x_3 can no longer be negative and x_2 and x_1 are at their lower and upper limits, respectively.

In some cases the solutions, as above, may be obvious but, of course, this cannot always be expected. Recourse to the preceding development, however, will produce the wanted results in any case, as we illustrate by now developing the above example, along with the related background materials, in some detail as follows:

Because the denominator is unisignant we utilize section III. Observing that d_2 = 2 in the denominator and hence is nonnegative, we have recourse only to the first part of (3.1) in order to write

(6.1)  y_1 − a_1 = g_1 ξ_1,  y_2 − a_2 = g_2 ξ_2,  y_3 − a_3 = g_3 ξ_3,

where, respectively, a_1 = −1, a_2 = 1, and a_3 = 0, via (5.8). The development from (3.1) to (3.2) applies to this case as, with

(6.2)  0 ≤ ξ_1 ≤ δ_1 = 3/g_1,  0 ≤ ξ_2 ≤ δ_2 = 4/g_2,  0 ≤ ξ_3 ≤ δ_3 = 1/g_3.

The insertion of (6.1) into the functional then produces

(6.3)  [3(g_1 ξ_1 + a_1) − 4(g_3 ξ_3 + a_3) + 4]/[2(g_2 ξ_2 + a_2)] = (3g_1 ξ_1 − 4g_3 ξ_3 + 1)/[2(g_2 ξ_2 + 1)]

via (5.8). Choosing

(6.4)  g_1 = 1/2, g_2 = 1, g_3 = 1/2

and setting

(6.5)  γ_1 = 3/2, γ_2 = 0, γ_3 = −4/2, γ_0 = 1/2

gives the denominator form wanted for (3.3) as:

(6.6)  max R(ξ) = [(3/2)ξ_1 + 0·ξ_2 − 2ξ_3 + 1/2]/(ξ_1 + ξ_2 + ξ_3 + 1)
       with 0 ≤ ξ_1 ≤ 6, 0 ≤ ξ_2 ≤ 4, 0 ≤ ξ_3 ≤ 2.

The transformation ξ_j = η_j/η_0 from our previously developed theorem [8] then produces the following example for (3.4):

(6.7)  max (3/2)η_1 + 0·η_2 − 2η_3 + η_0/2
       with η_1 + η_2 + η_3 + η_0 = 1, η_1 − 6η_0 ≤ 0, η_2 − 4η_0 ≤ 0, η_3 − 2η_0 ≤ 0, η_1, η_2, η_3, η_0 ≥ 0,

where the g_j values of (6.4) combine with (6.2) to give δ_1 = 6, δ_2 = 4 and δ_3 = 2, as required for the application of (3.4).
The corresponding dual, which our previous theory also gives access to, is

(6.8)  min u with u + ω_1 ≥ 3/2, u + ω_2 ≥ 0, u + ω_3 ≥ −2, u − 6ω_1 − 4ω_2 − 2ω_3 ≥ 1/2, ω_1, ω_2, ω_3 ≥ 0.

This, of course, is the application of (3.5) to the present example. Since γ_1 = 3/2 exceeds γ_0 = 1/2 we are in the situation of Case (ii) following (3.5). Thus, preserving our subscript identifications from (6.5), we have

(6.9)  γ_1 > γ_0 ≥ γ_2 ≥ γ_3

in our present situation. We therefore see that the first application of the suggested algorithm should suffice. (See the remarks which conclude the case D > 0 in section III.) Applying (3.11) now produces

(6.10)  u^1 = [1/2 + (3/2)·6]/(1 + 6) = 19/14 > γ_0 = 1/2.

Evidently this also formally satisfies the condition that u^1 equals or exceeds the immediate successor of γ_1 in (6.9). Hence, we have

(6.11)  u^1 = u* = 19/14;  ω_1^1 = ω_1* = γ_1 − u* = 3/2 − 19/14 = 2/14;  ω_2^1 = ω_2* = ω_3^1 = ω_3* = 0,

which satisfy the constraints of (6.8), as may be verified, with min u = u* = 19/14. Moving to the primal problem via (3.9),

(6.12)  η_0^1 = η_0* = 1/(1 + δ_1) = 1/(1 + 6) = 1/7,  η_1^1 = η_1* = δ_1/(1 + δ_1) = 6/7,

and all other η_j^1 = 0. See (3.10). Inserting these values for the corresponding η_j in (6.7), we see that all constraints are satisfied with (3/2)η_1 + η_0/2 = (3/2)(6/7) + (1/7)(1/2) = 19/14, the same as the value of u*, thereby confirming optimality. In fact, as our theory [8] prescribes, we need merely apply the expressions

η_0 = 1/7,  ξ_1 = η_1/η_0 = (6/7)/(1/7) = 6

with all other η_j = 0 and then reverse the development from (6.6) to (6.7) in order to verify that this value is also optimal for

(6.13)  max R(ξ) = [(3/2)·6 + 1/2]/(6 + 1) = 19/14
        with 0 ≤ ξ_1 ≤ (b_1 − a_1)/g_1 = 6, 0 ≤ ξ_2 ≤ (b_2 − a_2)/g_2 = 4, 0 ≤ ξ_3 ≤ (b_3 − a_3)/g_3 = 2.

Evidently we can now directly effect substitutions in (6.1) and obtain

(6.14)  y_1 = g_1 ξ_1 + a_1 = 3 − 1 = 2;  y_2 = g_2 ξ_2 + a_2 = 0 + 1 = 1;  y_3 = g_3 ξ_3 + a_3 = 0 + 0 = 0.
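The numbers in this example are easy to check by machine. The following verification sketch is ours, not part of the paper: since a linear fractional functional with positive denominator attains its maximum over a box at a vertex, scanning the eight corners of the interval constraints in (5.8) suffices, and the first algorithm step (3.11) can be recomputed directly from the transformed data of (6.5).

```python
from itertools import product
from fractions import Fraction

# Interval bounds from (5.8): -1 <= y1 <= 2, 1 <= y2 <= 5, 0 <= y3 <= 1.
bounds = [(-1, 2), (1, 5), (0, 1)]
best = max((Fraction(3 * y1 - 4 * y3 + 4, 2 * y2), (y1, y2, y3))
           for y1, y2, y3 in product(*bounds))
assert best == (Fraction(5), (2, 1, 0))    # max R(y) = 5 at y* = (2, 1, 0)

# First step (3.11) of the section-III algorithm with the data of (6.5):
# gamma0 = 1/2, gamma1 = 3/2, delta1 = 6.
u1 = (Fraction(1, 2) + Fraction(3, 2) * 6) / (1 + 6)
assert u1 == Fraction(19, 14)              # agrees with (6.10)
```

Exact rational arithmetic via `fractions.Fraction` avoids any floating-point doubt about the value 19/14.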
Then we can proceed exactly as in (5.9) to obtain the values x_1 = 2, x_2 = 1, x_3 = 0, which we previously observed to be optimal.

SUMMARY

Although the development in this paper proceeded, for clarity, in algorithmic format, we summarize below the explicit solution in tabular format for direct theoretical interpretation and utilization.

Summary of Solution

    Denominator          Transformed numerator    R*
    Bisignant            --                       ∞
    Unisignant:
      (a) positive       --                       u^s = (γ_0 + Σ_{j=1}^s γ_j δ_j)/(1 + Σ_{j=1}^s δ_j), s the least integer with u^s ≥ γ_{s+1}
      (b) nonnegative    γ_0 > 0                  ∞
                         γ_0 = 0                  γ_1
                         γ_0 < 0                  u^s = (γ_0 + Σ_{j=1}^s γ_j δ_j)/(Σ_{j=1}^s δ_j), s the least integer with u^s ≥ γ_{s+1}

CONCLUSION

Before proceeding any further we should probably point up, again, the crucial role played by the general theory (including the transformations and proof procedures) which we introduced in [8] for making explicit contacts between linear fractional and ordinary linear programming, in all generality and exact detail. These transformations are also utilized in the present paper and the theory is also extended via the duality (and other) characterizations given in the preceding text. These are joined together here for proofs in the algorithmic format of the kind we have just illustrated by example and commentary. Other uses can undoubtedly also be made of this theory and the preceding extensions via the passage (up and back) between linear fractional and ordinary linear programming that is now possible.

Our general theory has been used by others, too, to extend or simplify parts of linear fractional programming en route to effecting the contacts with ordinary linear programming that are thereby obtained. The work of Zionts [22] should perhaps be singled out as being most immediately in line with the R = ∞ results presented in this paper. Zionts' development is directed only toward simplifying matters by focusing on eliminating cases for linear fractional programming which are either deemed to be unwanted or of little interest for practical applications.
The developments cease as soon as the contacts with ordinary linear programming are identified via our theory, which he, like others, utilizes for this purpose. We have effected the developments in this paper in a way that makes contact with interval linear programming.* An opening for further two-way flows is thereby also provided. The resulting junctures should also help to guide subsequent developments in the more special situations that now seem to invite consideration in the future. Finally, the possibilities for dealing with specially structured problems (such as those observed at the start of the present paper) should also be observed explicitly in this conclusion, partly because the theory we have now developed and presented should also be a helpful guide to these additional cases, which are important in their own right. Thus we can now conclude here by referring to our opening remarks.

*See [2], [3], [4].

ACKNOWLEDGMENT

We wish to thank W. Szwarc of the University of Wisconsin for comments which helped us to improve the exposition in the manuscript for this article.

BIBLIOGRAPHY

[1] Balas, E. and M. Padberg, "Equivalent Knapsack-Type Formulations of Bounded Integer Programs," Carnegie-Mellon University (Sept. 1970).
[2] Ben-Israel, A. and A. Charnes, "An Explicit Solution of a Special Class of Linear Programming Problems," Operations Research 16, 1166-1175 (1968).
[3] Ben-Israel, A., A. Charnes, and P. D. Robers, "On Generalized Inverses and Interval Linear Programming," Proceedings of The Symposium on Theory and Applications of Generalized Inverses, held at Texas Technological College, Lubbock, Tex. (Mar. 1968).
[4] Ben-Israel, A. and P. D. Robers, "A Decomposition Method for Interval Linear Programming," Management Science 16, No. 5 (Jan. 1970).
[5] Bradley, G., "Transformation of Integer Programs to Knapsack Problems," Yale University, Rept. No. 37 (1970). To appear in Discrete Mathematics.
[6] Chadda, S.
S., "A Decomposition Principle for Fractional Programming," Opsearch 4, 123-132 (1967).
[7] Charnes, A. and W. W. Cooper, Management Models and Industrial Applications of Linear Programming (John Wiley & Sons, Inc., New York, 1961).
[8] Charnes, A. and W. W. Cooper, "Programming with Linear Fractional Functionals," Nav. Res. Log. Quart. 9, 181-186 (Sept.-Dec. 1962).
[9] Dorn, W. S., "Linear Fractional Programming," IBM Research Report RC-830 (Nov. 27, 1962).
[10] Derman, C., "On Sequential Decisions and Markov Chains," Management Science 9, 16-24 (1962).
[11] Gilmore, P. C. and R. E. Gomory, "A Linear Programming Approach to the Cutting Stock Problem," Operations Research 11, 863-888 (1963).
[12] Glover, F. and R. E. Woolsey, "Aggregating Diophantine Equations," University of Colorado Report 70-4 (Oct. 1970).
[13] Isbell, J. R. and W. H. Marlow, "Attrition Games," Nav. Res. Log. Quart. 3, 71-93 (1956).
[14] Jagannathan, R., "On Some Properties of Programming in Parametric Form Pertaining to Fractional Programming," Management Science 12, 609-615 (1966).
[15] Joksch, H. C., "Programming with Fractional Linear Objective Function," Nav. Res. Log. Quart. 11, 197-204 (1964).
[16] Klein, M., "Inspection-Maintenance-Replacement Schedules under Markovian Deterioration," Management Science 9, 25-32 (1962).
[17] Martos, B., "Hyperbolic Programming," translated by A. and V. Whinston, Nav. Res. Log. Quart. 11, 135-155 (1964).
[18] Robers, P. D., "Interval Linear Programming," Ph.D. Thesis submitted to Northwestern University (Evanston, Ill.), Dept. of Industrial Engineering and Management Sciences (1968).
[19] Robers, P. D. and A. Ben-Israel, "A Suboptimization Method for Interval Linear Programming," Systems Research Memo No. 206, Northwestern University (Evanston, Ill.), The Technological Institute (June 1968).
[20] Swarup, K., "Linear Fractional Functionals Programming," Operations Research 13, 1029-1036 (1965).
[21] Wagner, H. M. and J. S. C.
Yuan, "Algorithmic Equivalence in Linear Fractional Programming," Management Science 14, 301-306 (Jan. 1968).
[22] Zionts, S., "Programming with Linear Fractional Functionals," Nav. Res. Log. Quart. 15, 449-452 (Sept. 1968).

USING DECOMPOSITION IN INTEGER PROGRAMMING

Linus Schrage
Graduate School of Business
University of Chicago

ABSTRACT

When implicit enumeration algorithms are used for solving integer programs, a form of primal decomposition can be used to reduce the number of solutions which must be implicitly examined. If the problem has the proper structure, then under the proper decomposition a different enumeration tree can be defined for which the number of solutions which must be implicitly examined increases with a power of the number of variables rather than exponentially. The proper structure for this kind of decomposition is that the southwest and northeast corners of the constraint matrix be zero, or equivalently that the matrix be decomposable except for linking columns. Many real traveling salesman, plant location, production scheduling, and covering problems have this structure.

INTRODUCTION

Consider an integer program for which the constraint matrix has the form shown in Figure 1. The important feature is that the columns of A_2 link two otherwise independent subproblems.

Figure 1. (A constraint matrix with blocks A_1, A_2, A_3; the columns of A_2 overlap the rows of both outer blocks.)

In general, a set of columns is a linking set if deleting that set partitions all remaining columns into two disjoint, nonempty sets, A_1 and A_3, such that there do not exist two columns, one in A_1 and one in A_3, both having a nonzero entry in the same row. For simplicity, assume that all variables are required to be either 0 or 1, and that there are n variables in each of the blocks A_1, A_2, and A_3. There are then 2^{3n} solutions to be implicitly examined.

FORM OF THE DECOMPOSITION

We assume that an enumerative scheme similar to that described in Geoffrion [4] is to be used. We can think of A_2 as being the master problem.
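As a toy illustration of this master/subproblem idea (our own sketch; the separable objective functions below are arbitrary stand-ins, not Schrage's test problems): once the 0-1 variables of the master block A_2 are fixed, the two remaining subproblems can be maximized independently and their values added.

```python
from itertools import product

def solve_decomposed(n, h1, g, h3):
    """Maximize h1(x1, x2) + g(x2) + h3(x3, x2) over 0-1 vectors x1, x2, x3
    of length n each, where x1 and x3 interact only through the master
    block x2 (the linking columns A2).  Work: 2^n * (2 * 2^n) objective
    evaluations instead of 2^(3n)."""
    best = None
    for x2 in product((0, 1), repeat=n):              # fix the master block
        left = max(h1(x1, x2) for x1 in product((0, 1), repeat=n))
        right = max(h3(x3, x2) for x3 in product((0, 1), repeat=n))
        cand = left + g(x2) + right                   # independent subproblems
        best = cand if best is None else max(best, cand)
    return best
```

For objectives of this separable shape the result agrees with brute force over all 2^{3n} settings, while the work grows only as 2^{2n+1}.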
If we fix the values of the variables in block A_2 to some set of values, then the variables in blocks A_1 and A_3 constitute independent subproblems and can be solved separately. Solving one of these problems (A_1 or A_3) requires us to implicitly examine 2^n solutions. Each of these two problems must be solved for each possible setting of the variables in A_2. Therefore, we must effectively enumerate 2^n(2^n + 2^n) = 2^{2n+1} solutions. If we define a node to be the setting of some variable to one of its two values, then perhaps a better measure of computational difficulty is the number of nodes in the enumeration tree. The nondecomposition complete enumeration tree has approximately 2^{3n+1} nodes in it. By use of decomposition the complete enumeration tree has 2^n(2^{n+1} + 2^{n+1}) + 2^{n+1} = 2^{2n+2} + 2^{n+1} nodes. In this sense, decomposition reduces the difficulty of the problem by a factor of approximately 2^{n-1}.

EXISTENCE OF THE DECOMPOSITION STRUCTURE

One can argue that this stairstep structure exists in several classes of integer programs. In a traveling salesman problem based on the United States, the variables in A_2 might correspond to the choice of arcs connecting cities in the Midwest. For each set of these arcs chosen, one should be able to complete the eastern and western legs of the tour independently.

Similar arguments can be given for the plant location problem based on cities in the United States. The variables in A_2 would correspond to the decisions of which plants to build in the Midwest. For each set of Midwest plant decisions one would expect to be able to solve the eastern and western plant location problems independently. The Wagner-Whitin [10] dynamic lot-size problem is a special case of the plant location problem which has an even more obvious decomposition structure. Once we decide to produce in a particular period, the production plans for subsequent and for previous periods can be solved independently.
By discarding variables which obviously cannot be in the optimal solution, the 12-period problem given in Wagner and Whitin [10] can be decomposed into a stairstep structure with seven blocks.

Another class of integer programs with this structure is covering problems. One situation giving rise to covering problems is in the assignment of vehicles to routes. Each city in the service area must be "covered" by at least one route. Arguments similar to those for the traveling salesman and plant location problems can then be made.

GENERALIZATION TO MORE THAN THREE BLOCKS

Consider a constraint matrix with the stairstep structure shown in Figure 2. Again, for simplicity, we assume the problem is composed only of 0-1 variables.

Figure 2. (A stairstep constraint matrix with blocks A_1, A_2, . . ., A_{k-1}, A_k, each block overlapping only its neighbors.)

This problem can be decomposed into a hierarchy of masters and subproblems. Assume there is a total of k blocks. Choose as the highest-level master block number [k/2] + 1, where [x] is the greatest integer no larger than x. This divides the problem into two independent subproblems with [k/2] and [(k-1)/2] blocks each. These two subproblems can themselves be decomposed in similar fashion. If this decomposition is carried on in this recursive fashion we will then have approximately log_2 k levels of decomposition. In doing the enumeration we will first set the variables in block [k/2] + 1. For each setting of these variables we must solve the independent subproblems composed of blocks 1 to [k/2] and blocks [k/2] + 2 to k. The first of these subproblems is solved by first setting the variables in block [[k/2]/2] + 1. This divides the problem further into independent subproblems. This enumeration method is applied recursively to each independent subproblem created. The result is a binary tree of independent subproblems. For simplicity assume that each block contains n 0-1 variables. Let s(j) be the number of terminal nodes in a decomposition enumeration tree with j levels.
If we increase the number of blocks such that the number of levels of decomposition increases from j to j + 1, then each of the 2^n solutions to the highest-level master partitions the remainder of the problem into two j-level problems. We then have that s(j+1) = 2^n · 2 · s(j), where s(1) = 2^n. Thus, s(j) = (2^{n+1})^j/2 and the number of terminal nodes in the decomposition tree of a problem with k blocks is (1/2)(2^{n+1})^{log_2 k} = (1/2)k^{n+1}. A perhaps more accurate measure of the size of the enumeration tree is the total number of nodes, terminal and intermediate, in the tree. Let t(j) be the total number of nodes in the decomposition enumeration tree with j levels of decomposition. The total number of nodes in a simple binary tree with 2^n terminal nodes is approximately 2^{n+1}. We can now argue as before to claim that approximately t(j+1) = 2^{n+1}(t(j) + 1). Now, t(1) is approximately 2^{n+1}. Thus t(j) is approximately

Σ_{i=1}^{j} (2^{n+1})^i = [2^{n+1} − (2^{n+1})^{j+1}]/[1 − 2^{n+1}],

which is approximately (2^{n+1})^j for n and j large. Thus, the total number of nodes in the decomposition tree of a problem with k blocks is approximately (2^{n+1})^{log_2 k} = k^{n+1}. We now see that the number of solutions and the total number of nodes which must be implicitly examined increase with a power of the number of variables if the block size remains constant as the problem size is increased. The number of solutions which would have to be implicitly examined under no decomposition is 2^{kn}; the number of nodes in the tree under no decomposition is 2^{kn+1} − 1. For example, if n = 10 and k = 3, then decomposition decreases the size of the tree by a factor of about 500. If k = 7 the factor is about 10^{12}.

Edmonds [3] made the interesting suggestion that a problem be considered tractable if and only if one can exhibit an algorithm for its solution whose running time is bounded by a polynomial in the size of the problem. There is no known algorithm for general integer programs which is polynomial bounded.
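The counting argument above is easy to mechanize. A small sketch (our own; the function names are ours) that checks the recursions s(j+1) = 2^{n+1} s(j) and t(j+1) = 2^{n+1}(t(j) + 1) against their closed forms, and reproduces the factor-of-about-500 comparison for n = 10 and three blocks:

```python
def s(j, n):
    """Terminal nodes of a decomposition enumeration tree with j levels,
    n 0-1 variables per block: s(1) = 2^n, s(j+1) = 2^(n+1) * s(j)."""
    return 2 ** n if j == 1 else 2 ** (n + 1) * s(j - 1, n)

def t(j, n):
    """Approximate total node count: t(1) = 2^(n+1), t(j+1) = 2^(n+1)*(t(j)+1)."""
    return 2 ** (n + 1) if j == 1 else 2 ** (n + 1) * (t(j - 1, n) + 1)

n, j = 3, 4
assert s(j, n) == (2 ** (n + 1)) ** j // 2            # closed form s(j) = (2^(n+1))^j / 2
assert t(j, n) == sum((2 ** (n + 1)) ** i for i in range(1, j + 1))

# Three blocks, n = 10: nodes with decomposition 2^(2n+2) + 2^(n+1),
# versus about 2^(3n+1) without -- a ratio of roughly 2^(n-1) = 512.
n = 10
with_decomp = 2 ** (2 * n + 2) + 2 ** (n + 1)
without = 2 ** (3 * n + 1)
assert 500 < without / with_decomp < 512
```

The same closed forms give the k^{n+1} total-node estimate for k blocks once j is taken as log_2 k.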
In fact, Karp [5] gives "theorems which strongly suggest, but do not imply, that these problems, as well as many others, will remain intractable perpetually." It is interesting therefore that the class of decomposable integer programs just described can be solved by a polynomial bounded algorithm, namely, simply searching the decomposition enumeration tree.

The Wagner-Whitin [10] example problem, for example, is a 0-1 problem with 12 variables. The first variable is required to be 1, so there are 2^{11} = 2,048 feasible solutions to the problem and 4,095 nodes in the simple enumeration tree. Using the seven-block decomposition mentioned earlier, the equivalent of only 96 solutions need be examined implicitly. The number of nodes in this decomposition tree is 191. The computation involved in an enumerative algorithm should be approximately proportional to the number of nodes examined. Complete enumeration of the 191-node decomposition tree in this case is not an unreasonable solution method.

MORE THAN TWO BLOCKS PER DECOMPOSITION

In the analysis thus far we have assumed that a linking block decomposed a problem into two independent subproblems. Much the same analysis could be done if a set of linking columns decomposed a problem into more than two independent subproblems. See, for example, Figure 3.

Figure 3. (A constraint matrix in which the linking block A_1 joins four otherwise independent blocks A_2, A_3, A_4, A_5.)

For each of the settings of the variables in A_1 which must be examined, the subproblems A_2, A_3, A_4, and A_5 can be solved independently.

PROBLEMS WITH LINKING CONSTRAINTS

The more common form of decomposition structure discussed in the linear programming literature involves a set of submatrices with no rows in common, except that there is a set of constraints at the top linking all the submatrices together. It may be that problems which are formulated with that structure actually have a structure like Figure 2. The rows in common between A_1 and A_2, A_3 and A_4, etc., could be moved to the top and then one would have the more common form of decomposition structure.
The decomposition approach described may also be useful for problems where the natural formulation is with linking constraints, by realizing that a linking constraint can be replaced by a linking column and two nonlinking constraints. Consider the linking constraint:

Σ_{j=1}^{2n} a_j x_j = b.

Suppose that the problem is decomposable into two subproblems, each with n variables, except for this constraint. This linking constraint can be replaced by one linking variable, y, and two nonlinking constraints as follows:

Σ_{j=1}^{n} a_j x_j − y = 0,
y + Σ_{j=n+1}^{2n} a_j x_j = b.

Assume again that the x_j must be either 0 or 1 and all the a_j are integer. In the worst possible case we must examine 2^n different values for y, and the size of the tree under decomposition actually doubles. We would expect the number of different values of y to be examined to be much less than 2^n. In a covering problem, for example, the a_j's are either 0 or 1. Then, in the worst possible case we must examine n different values of y. The number of terminal nodes in the decomposition tree is then n(2^n + 2^n) = n2^{n+1} versus 2^{2n} terminal nodes in the nondecomposition tree. If n = 20, for example, then the number of terminal nodes is reduced by a factor of approximately 26,000.

If one considers the usual decomposition structure studied in linear programming where there are p independent subproblems row-linked by a single master block with q rows, then the interested reader should be able to convince himself that the proper generalization of the transformation described in the previous problem is to reformulate the problem as a multilevel decomposition problem with p − 1 sets of linking columns and approximately log_2 p levels of decomposition. Each linking set would consist of q columns.

MIXED VARIABLES CASE

We have considered only pure 0-1 integer programs thus far. The analysis generalizes fairly naturally to the case where some of the variables may take on any integer value.
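The replacement of a linking constraint by a linking column, described above, can be written down mechanically. A sketch of the bookkeeping (ours; the row representation is an assumption, not the paper's notation):

```python
def split_linking_constraint(a, b, n):
    """Replace sum_{j=1}^{2n} a_j x_j = b by the pair
         sum_{j=1}^{n}    a_j x_j - y = 0,
         y + sum_{j=n+1}^{2n} a_j x_j = b,
       coupled only through the new linking variable y.
       Each returned row is (coefficients of x_1..x_2n, coefficient of y, rhs)."""
    row1 = (a[:n] + [0] * n, -1, 0)
    row2 = ([0] * n + a[n:], +1, b)
    return row1, row2

# Any x feasible for the original constraint stays feasible for the pair
# with y set to the value of the first-half sum:
a, b, n = [1, 0, 1, 1, 0, 1], 2, 3
x = [1, 0, 1, 0, 0, 0]                      # satisfies sum a_j x_j = 2 = b
y = sum(ai * xi for ai, xi in zip(a[:n], x[:n]))
(c1, cy1, r1), (c2, cy2, r2) = split_linking_constraint(a, b, n)
assert sum(ci * xi for ci, xi in zip(c1, x)) + cy1 * y == r1
assert cy2 * y + sum(ci * xi for ci, xi in zip(c2, x)) == r2
```

With 0-1 coefficients, as in a covering problem, y can only take the values 0, 1, . . ., n, which is the source of the n-value worst case quoted in the text.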
The analysis also extends to the case where some of the variables are not required to be integral, if fixing the integer variables in a block completely implies values for the continuous variables in that block.

DISADVANTAGES OF THIS DECOMPOSITION METHOD AND THEIR PARTIAL ALLEVIATION

A disadvantage of this decomposition method is that the order in which variables are placed in the enumeration tree is partially specified beforehand. The importance of flexibility in the tree search, especially in specifying the order of addition of variables to the enumeration tree, has been pointed out in [2], [4], [6], [7], [9]. For example, if there is a variable which can take on the value of either 0 or 1 in the optimal solution (i.e., there are alternate optima), then the amount of searching required by a branch-and-bound or implicit enumeration algorithm is approximately doubled if this variable is placed first in the tree rather than last. If there are unimportant variables in the highest-level master problem, the performance of an implicit enumeration algorithm could be appreciably degraded by the decomposition approach. Another apparent disadvantage is the additional bookkeeping which must be done to keep track of the decomposition tree.

The first disadvantage can be alleviated somewhat by adapting the flexible tree-search procedure described in Tuan [9] and Bravo et al. [2]. This approach allows one to partially reorder a tree without re-searching branches already searched or discarding branches yet to be searched. With respect to the second disadvantage, the bookkeeping scheme is not significantly more complex than conventional ones.
The additional restrictions on the branching scheme are that (a) we cannot branch on a variable i unless all variables in the master of the subproblem containing i have been fixed; (b) we can backtrack any time the bound on the subproblem currently being solved is worse than the value of some feasible solution to the subproblem for the current setting of the variables in the master problem; or (c) we can backtrack any time that the overall bound is worse than the value of some known feasible solution to the entire problem.

COMPUTATIONAL EVALUATION

A computer program was written incorporating the decomposition method into a backtrack implicit enumeration scheme. The program was similar to those described by Geoffrion [4]. The revised simplex method of linear programming with explicit inverse was used to calculate bounds. After any variable was forced to 0 or 1, dual pivots were performed to return to feasibility. After any variable in a backtrack step was released from 0 or 1, primal pivots were performed to return to optimality. The rule for selecting the next variable to force to 0 or 1 was simply to force to the nearer integer value that basic variable which was closest in value to an integer.

The implementation was inefficient in at least two ways: (1) The LP portion worked with a full inverse at all times. That is, even though only a subproblem was being optimized, pivots were done in the full inverse. For the problems considered, each subproblem had half as many rows as the full problem; therefore each pivot in the implemented procedure may have taken as much as four times as much work as really necessary. (2) Natural integrality in the master problem was not taken advantage of. Before a subproblem can be searched, each variable in its master must be fixed to an integer value.
If some variable x_j in the master was fixed to 1 and would have remained at the value 1, even if not constrained to the value 1, all through the enumeration of all subproblems, then it would not be necessary to examine x_j = 0 in the backtrack step. Most integer programming routines take advantage of this natural integrality. The program here did not.

A class of decomposable integer programs was derived based on a problem known as IBM-1. A description of this problem can be found in Trauth and Woolsey [8]. The problem is a general integer program with seven variables and seven constraints. Two 0-1 variables were required to represent each of the original seven variables. A series of eight problems with the stairstep structure shown in Figure 2 was created. Each problem was composed of three blocks. Each of the two outer blocks consisted of a copy of the IBM-1 constraint matrix. In the eight different problems, the middle linking block consisted of the first k columns of IBM-1, where k ranged from zero to seven. All problems had 14 rows.

Figure 4. (Pivots, with and without decomposition, plotted against the number of linking columns.)

These problems were solved on an IBM 360/65. This machine uses multiprogramming; thus run times are random variables. The number of pivots is therefore perhaps a more reliable estimate of computational difficulty because most of the work is involved in pivoting. This statistic is plotted versus the number of linking columns in Figure 4. As expected, the advantage of decomposition tends to diminish as the number of linking columns increases. The same program was used to solve the problems without decomposition; the program was simply not told that the problem was decomposable. Recall that under decomposition, each pivot should require less work than under no decomposition. Other computational statistics are displayed in Table 1.
The link-edit time column is included only to give an indication of the variability in run times. The link-edit step required exactly the same amount of work in every run.

Table 1. A Comparison of Decomposition with No Decomposition
("With" = with decomposition, "W/o" = without)

 No. of linking |    Pivots    |  Time (sec)  | Nodes examined | Link-edit time (sec)
    columns     | With   W/o   | With   W/o   |  With   W/o    |  With    W/o
       0        |  174   246   |  4.50  6.22  |   48     79    |  2.62    2.80
       1        |  241   427   |  6.07  7.45  |   52    106    |  2.55    2.75
       2        |  257   404   |  8.12  8.07  |   57    108    |  3.15    2.70
       3        |  255   445   |  7.09  8.40  |   60    109    |  2.87    2.74
       4        |  334   406   |  8.84  7.89  |   68    102    |  3.24    2.60
       5        |  346   383   |  8.92  7.39  |   70     78    |  2.55    2.67
       6        |  384   446   |  9.97  9.07  |   87     73    |  2.47    2.85
       7        |  384   262   | 10.40  9.29  |   99     86    |  2.84    2.92

The run times under decomposition are perhaps longer than they need be in practice because a full solution report was printed each time a better integer solution was obtained to any subproblem. The runs without decomposition would typically produce three solution reports, while a run under decomposition would typically produce, say, nine solution reports. A full solution report requires a fair amount of work because the inverse must be multiplied through the full matrix to calculate such things as reduced costs and dual prices. The decomposition method for these problems seems to be fairly robust in that the amount of work seems to increase less than linearly with the number of linking columns for this class of problems. For a small number of linking columns, decomposition is clearly superior.

REFERENCES

[1] Balas, E., "An Additive Algorithm for Solving Linear Programs with Zero-One Variables," Operations Research 13, 517-546 (1965).
[2] Bravo, A., J. G. Gomez, L. Lustosa, L. Schrage, and N. Pizzolato, "A Mixed Integer Programming Code," CMSBE Report No. 7043, University of Chicago (Sep. 1970).
[3] Edmonds, J., "Paths, Trees, and Flowers," Canadian J. Math. 17, 449-467 (1965).
[4] Geoffrion, A.
M., "An Improved Implicit Enumeration Approach for Integer Programming," Operations Research 17, 437-454 (May-June 1969).
[5] Karp, R. M., "Reducibility Among Combinatorial Problems," presented at the ORSA National Convention, New Orleans, La. (Apr. 26, 1972).
[6] Salkin, H. M., "On the Merit of Generalized Origin and Restarts in Implicit Enumeration," Operations Research 18, 549-555 (May-June 1970).
[7] Spielberg, K., "Plant Location with Generalized Search Origin," Management Science 16, 165-178 (1969).
[8] Trauth, C. A. and R. E. Woolsey, "Integer Linear Programming: A Study in Computational Efficiency," Management Science 15, 481-493 (May 1969).
[9] Tuan, Nghiem Ph., "A Flexible Tree-Search Method for Integer Programming Problems," Operations Research 19, 115-119 (Jan.-Feb. 1971).
[10] Wagner, H. and T. M. Whitin, "Dynamic Version of the Economic Lot Size Model," Management Science 5, No. 1 (Oct. 1958).

NUMERICAL TREATMENT OF A CLASS OF SEMI-INFINITE PROGRAMMING PROBLEMS*

S. A. Gustafson
The Royal Institute of Technology
Stockholm, Sweden

and

K. O. Kortanek
Carnegie-Mellon University
Pittsburgh, Pennsylvania

ABSTRACT

Many optimization problems occur in both theory and practice when one has to optimize an objective function while an infinite number of constraints must be satisfied. The aim of this paper is to describe methods of handling such problems numerically in an effective manner. We also indicate a number of applications.

1. INTRODUCTION

In order to illustrate the subject of this paper, we immediately give a few examples of the class of problems we wish to study.

EXAMPLE 1.1: One wants to determine a cumulative distribution function G which corresponds to a stochastic variable which can assume values inside a finite closed interval [a, b]. In N points $t_1, t_2, \ldots, t_N$ one has measured the values of G and obtained $g_1, g_2, \ldots, g_N$. We want to approximate G in [a, b] by a polynomial P of a degree less than a certain number n.
It is natural to write

$$P(t) = \sum_{r=1}^{n} y_r t^{r-1}$$

and require that $P(a) = 0$, $P(b) = 1$, and $P'(t) \ge 0$. Since we cannot hope that P passes through the measured points, we try to solve the problem

$$\inf_{y_1, y_2, \ldots, y_n} \sum_{j=1}^{N} \left( \sum_{r=1}^{n} y_r t_j^{r-1} - g_j \right)^2$$

subject to

$$\sum_{r=1}^{n} y_r a^{r-1} = 0,$$
$$\sum_{r=2}^{n} (r-1) y_r t^{r-2} \ge 0, \quad a \le t \le b,$$
$$\sum_{r=1}^{n} y_r b^{r-1} = 1.$$

*This research was supported in part by National Science Foundation Grant GK 31-833.

It is easily shown by examples that this problem may or may not be feasible, depending on the given data. This problem appeared when one wanted to study size distributions of grains in gravel deposits in order to get a suitable raw material for concrete production (Gustafson-Martna [27]). In the referenced paper, one did not attempt to solve exactly the problem indicated above, but used piecewise polynomial interpolation through the measured points instead, in order to meet the monotonicity condition.

EXAMPLE 1.2: Bojanic-DeVore [4] discuss the problem of one-sided approximation of a given function from below, while maximizing a linear functional. Their problem can be stated as follows: Let [a, b] be a closed interval and $u_1, u_2, \ldots, u_n$ n given functions which form a Čebyšev system (for a definition see, e.g., Karlin-Studden [32] or Gustafson [20]). Let further $\phi$ be continuous on [a, b]. Determine

$$\max \sum_{r=1}^{n} y_r \int_a^b u_r(t)\,dt, \quad \text{subject to} \quad \sum_{r=1}^{n} y_r u_r(t) \le \phi(t), \quad t \in [a, b].$$

Bojanic-DeVore give some unicity and existence results and identify the solutions with certain quadrature rules. Methods for finding computational solutions to this problem are given in Gustafson [20].

EXAMPLE 1.3: Let again [a, b] be a closed interval, $\omega$ a positive function defined on [a, b] and $\phi$ continuous on the same interval. We want to determine a polynomial of degree less than n which approximates $\phi$ as well as possible in the weighted maximum norm determined by $\omega$. That is, we want to solve the problem: Compute

$$\min \eta \quad \text{subject to} \quad \omega(t) \left| \sum_{r=1}^{n} y_r t^{r-1} - \phi(t) \right| \le \eta, \quad t \in [a, b].$$
We can write this task in the equivalent form: Compute

$$\min \eta$$

subject to

$$\sum_{r=1}^{n} y_r t^{r-1} \omega(t) - \eta \le \phi(t)\,\omega(t) \quad \text{and} \quad -\sum_{r=1}^{n} y_r t^{r-1} \omega(t) - \eta \le -\phi(t)\,\omega(t).$$

This problem is well known, and for $\omega(t) = 1$ a solution is constructed computationally by means of Remez' algorithms (see, e.g., Cheney [9]). For numerical purposes, an approximative solution often is satisfactory (Powell [41], Gustafson-Dahlquist [23]).

EXAMPLE 1.4: Kantorovich-Rubinshtein [31] and Rubinshtein [43] give examples of production scheduling problems where an infinite number of linear constraints must be met. Also, Vershik-Temel't [48] propose a process of finding a sequence of approximate finite linear programming problems whose optimal values converge to the optimal value of the infinite linear programming problem.

EXAMPLE 1.5: Gorr-Kortanek [18], Gustafson-Kortanek [24] and Gustafson-Kortanek [25] give examples of models for the study of air pollution problems, where an infinite number of linear constraints must be fulfilled over a two-dimensional set S.

Additional Examples and Problems

As illustrated above, moment problems stem from problems in approximation and minimization; see Shohat-Tamarkin [45], Rivlin-Shapiro [42], Shapiro [44], Karlin-Studden [32] and others. This leads to applications of infinite programming techniques to analysis, Duffin [13], [14], Kretschmer [36], [37], Duffin-Karlovitz [15], including the development of orthogonality theorems and similar results with application to the theory of integral equations. While applications of the moment problem to statistics and probability theory are well known, recent problems in these areas have been brought into contact with the theory of moments by Krafft [35]. Interesting applications of generalized moment problems have also been made in theoretical physics; see Baker-Gammel [1].
See also the classification theory of Ben Israel-Charnes-Kortanek [3] for linear programming problems over closed convex sets in locally convex spaces and applications to approximation theory.

DEFINITION 1.1: We denote by problem D the general task: Compute

$$\inf_{y_1, y_2, \ldots, y_n} G(y_1, y_2, \ldots, y_n),$$

subject to

(1.1) $$\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S.$$

Here S is a given set, and $u_1, u_2, \ldots, u_n, \phi$ are given functions defined over S. The objective function G is also given and must be defined for all vectors $y_1, y_2, \ldots, y_n$ which satisfy (1.1). We will refer to problem D as a semi-infinite program. The fact that n is finite is crucial in our analysis. We observe that all the examples, 1.1 through 1.5, are instances of problem D.

Many well-known optimization problems are subsumed by problem D. If S has a finite number of elements, we arrive at mathematical programming tasks of various kinds. Note in particular that if G also is linear, we get linear programs. For a discussion of these problems, the reader is referred to the textbooks by Charnes-Cooper [6] and Dantzig [10]. We want, instead, to discuss the case when S has an infinite number of elements. The general theory of semi-infinite programming embracing such problems is given in several papers by Charnes, Cooper, and Kortanek [7], [8]. Included in their theory is the development of regularization techniques, analogous to those of finite linear programming, which we use in our computational developments.

In section 2 of this paper we present the parts of the general theory which are relevant for our purpose. We discuss the intimate connection between problem D and certain so-called moment problems. We establish that the solutions of problem D can be found if one can solve a system of a finite number of scalar equations in a finite number of unknowns. This system is nonlinear even if G is linear, and its numerical solution is a nontrivial task.
In section 3 we propose an algorithm to be used in practical computational work. The basic idea underlying our algorithm is that D is approximated by a problem with a finite number of constraints. The solution hereby obtained is then used as an initial approximation, which is then improved by Newton-Raphson iterations (other iterative methods might be considered). Thus the solution of semi-infinite programs can be achieved by combining well-known standard techniques. In particular problems, special short cuts can be used in order to facilitate the computations (see, e.g., Gustafson [20], [21]). We also discuss questions in connection with error estimation. We treat the problem of assessing how perturbations of input data influence the optimal solution and the corresponding value of the objective function.

2. GENERAL RESULTS

DEFINITION 2.1: Let K be the set of vectors $y = (y_1, y_2, \ldots, y_n)$ which satisfy (1.1). We refer to K as the constraint set of D.

We note that K is always contained in $R^n$, the n-dimensional vector space (independent of the nature of S). If K is nonempty, it is convex. We give three simple instances of D in order to illustrate different situations that can occur.

EXAMPLE 2.1: Find $\inf y_1 + y_2$, when $y_1 x + y_2 x^2 \ge 1$, $x \in [0, 1]$ (inconsistent).

EXAMPLE 2.2: Find $\inf y_1$, when $y_1 + y_2 x \ge \sqrt{x}$, $x \in [0, 1]$ (the inf-value is 0, but it is not attained).

EXAMPLE 2.3: Find $\inf y_1 + \frac{1}{2} y_2$, when

$$y_1 + y_2 x \ge \frac{1}{1+x}, \quad x \in [0, 1]$$

(the inf-value is 3/4, and is assumed for $y_1 = 1$, $y_2 = -1/2$). Another example is treated in Lemma 3.5 in this paper.

It is quite obvious that problems such as Examples 2.1 and 2.2 above will cause difficulties in actual machine computation. We therefore want to specialize problem D, but in such a way that we still retain wide generality, and consider only what we call regularized problems. (Compare Gustafson [20].)
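Example 2.3 can be illustrated numerically. The following hypothetical sketch (not part of the paper) replaces [0, 1] by the finite grid {0, 1/2, 1}; the resulting two-variable linear program is solved by brute-force enumeration of constraint-pair vertices, and already reproduces the value 3/4 and the solution y = (1, -1/2).

```python
# Hedged sketch: Example 2.3 discretized on a grid. We minimize
# y1 + 0.5*y2 subject to y1 + y2*x >= 1/(1+x) for x in a finite grid,
# by brute-force vertex enumeration (adequate here: 2 variables,
# bounded LP whose optimum lies at a two-active-constraint vertex).
from itertools import combinations

def solve_grid_lp(grid):
    b = [1.0 / (1.0 + x) for x in grid]           # right-hand sides phi(x)
    best = None
    for i, j in combinations(range(len(grid)), 2):
        if grid[i] == grid[j]:
            continue
        y2 = (b[i] - b[j]) / (grid[i] - grid[j])  # both constraints active
        y1 = b[i] - grid[i] * y2
        if all(y1 + y2 * x >= bk - 1e-12 for x, bk in zip(grid, b)):
            val = y1 + 0.5 * y2
            if best is None or val < best[0]:
                best = (val, y1, y2)
    return best

val, y1, y2 = solve_grid_lp([0.0, 0.5, 1.0])
# The grid contains the active points x = 0 and x = 1, so the discretized
# problem already attains the semi-infinite optimum 3/4 at y = (1, -1/2).
```

The grid happens to include both points where the optimal slack vanishes, which is why no further refinement is needed in this instance.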
DEFINITION 2.2: We denote by problem $D_F$ a special case of problem D having the properties 1, 2, 3 below:

1. S can be written as $S_K \cup S_F$, where $S_K$ is a compact subset of the k-dimensional vector space ($k < \infty$) and $S_F$ has a finite number of elements. The conditions

(2.1) $$\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_F,$$

must be such that they alone restrict y to a bounded region $K_F$ of $R^n$. We require, however, that (2.1) is consistent.

2. We require further that $u_1, u_2, \ldots, u_n$ and $\phi$ are continuous over $S_K$ and that $u_1, u_2, \ldots, u_n$ meet Krein's condition: There exist constants $c_1, c_2, \ldots, c_n$ such that

$$\sum_{r=1}^{n} c_r u_r(x) > 0, \quad x \in S_K.$$

3. G must be differentiable and convex on K.

We note that for a task of type $D_F$, K is a compact set. Hence the minimum value Z is always assumed. Instead of Krein's condition, Gustafson [20] requires that $u_1, u_2, \ldots, u_n$ form a Čebyšev system (a unisolvent set). Unfortunately, this cannot be done if k > 1. We quote the following classical result from Buck [5] (the notations are slightly changed; $C(\Omega)$ denotes the space of functions continuous over $\Omega$).

LEMMA 2.1: If $\Omega$ is a compact connected set and $C(\Omega)$ contains a unisolvent linear subspace of finite dimension at least 2, then $\Omega$ is homeomorphic to the unit interval or the unit circumference.

Therefore, many results in this section will be generalizations of those in Gustafson-Kortanek-Rom [26]. In order not to get unnecessarily complicated formulae, we treat only the case that the inequalities corresponding to $x \in S_F$ are of the type (2.2) below. We want first to treat the case

$$G(y_1, y_2, \ldots, y_n) = \sum_{r=1}^{n} \mu_r y_r,$$

and then generalize the results to a general $D_F$-problem. Denote the problem by $D_{F_0}$: Compute

$$Z = \min \sum_{r=1}^{n} y_r \mu_r,$$

subject to

$$\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_K, \quad \text{and}$$

(2.2) $$F_1 \le y_r \le F_2, \quad r = 1, 2, \ldots, n.$$

We want to show that the optimal solution of $D_{F_0}$ can be found by solving a nonlinear system of equations.
In order to derive this, we apply the theory of semi-infinite programming. For this purpose we need a few notations. Let $\Sigma$ be the set of all finite regular measures which meet the integrability conditions

$$\int |u_r(x)|\,d\alpha(x) < \infty, \quad r = 1, 2, \ldots, n, \qquad \int |\phi(x)|\,d\alpha(x) < \infty.$$

Denote by $M_n$ the convex cone in $R^n$:

$$M_n = \left\{ \sigma = (\sigma_1, \sigma_2, \ldots, \sigma_n) \;\middle|\; \sigma_r = \int u_r(x)\,d\alpha(x), \; r = 1, 2, \ldots, n, \; \alpha \in \Sigma \right\}.$$

Introduce now the problems $P_0$ and $D_0$:

$D_0$: Compute $\displaystyle \inf_{y_1, \ldots, y_n} \sum_{r=1}^{n} y_r \mu_r$, subject to $\displaystyle \sum_{r=1}^{n} y_r u_r(x) \ge \phi(x)$, $x \in S_K$.

$P_0$: Compute $\displaystyle \sup_{\alpha} \int_{S_K} \phi(x)\,d\alpha(x)$, subject to $\displaystyle \int_{S_K} u_r(x)\,d\alpha(x) = \mu_r$, $r = 1, 2, \ldots, n$, $\alpha \in \Sigma$.

From Karlin-Studden [32, p. 472] we quote the result (the notations are slightly changed).

LEMMA 2.2: Let $\mu = (\mu_1, \mu_2, \ldots, \mu_n)$ belong to the interior of $M_n$. Then the optimal values of $P_0$ and $D_0$ are equal.

Arguing as in Gustafson [20], we can associate with $D_{F_0}$ the semi-infinite dual problem $P_{F_0}$: Compute

$$\max \int_{S_K} \phi(x)\,d\alpha(x) + \sum_{r=1}^{n} (F_1 \nu_r^{+} - F_2 \nu_r^{-}),$$

subject to

$$\int_{S_K} u_r(x)\,d\alpha(x) + \nu_r^{+} - \nu_r^{-} = \mu_r, \quad r = 1, 2, \ldots, n, \qquad \alpha \in \Sigma, \quad \nu_r^{+} \ge 0, \; \nu_r^{-} \ge 0, \quad r = 1, 2, \ldots, n.$$

$P_{F_0}$ is always feasible, but may be unbounded. Exactly as in Gustafson [20] and Charnes-Cooper-Kortanek [8], we establish:

LEMMA 2.3: Let $D_{F_0}$ have interior points. Then $P_{F_0}$ and $D_{F_0}$ are consistent and bounded. They assume their optimal values, which are equal.

We also obtain, following [8], [20]:

LEMMA 2.4: Among the optimal solutions of $P_{F_0}$ there are some which correspond to point-masses with a finite number of mass-points.

Furthermore,

LEMMA 2.5: Let $D_{F_0}$ have interior points and let an optimal solution of $P_{F_0}$ be given by

i) a point-mass distribution with mass $m_i$ at $x^i$, $i = 1, 2, \ldots, q$,
ii) $q^{+}$ of the $\nu_r^{+}$ are positive, namely $\nu_{r_1}^{+}, \nu_{r_2}^{+}, \ldots, \nu_{r_{q^{+}}}^{+}$,
iii) $q^{-}$ of the $\nu_r^{-}$ are positive, namely $\nu_{s_1}^{-}, \nu_{s_2}^{-}, \ldots, \nu_{s_{q^{-}}}^{-}$.

Let $y = (y_1, y_2, \ldots, y_n)$ be an optimal solution of $D_{F_0}$. Then the following equations are satisfied:

(2.3) $$\sum_{i=1}^{q} m_i u_r(x^i) + \nu_r^{+} - \nu_r^{-} = \mu_r, \quad r = 1, 2, \ldots, n.$$

(2.4) $$\sum_{r=1}^{n} y_r u_r(x^i) = \phi(x^i), \quad i = 1, 2, \ldots, q.$$
(2.5) $$y_{r_j} = F_1, \quad j = 1, 2, \ldots, q^{+}.$$

(2.6) $$y_{s_k} = F_2, \quad k = 1, 2, \ldots, q^{-}.$$

REMARK: We also have $q + q^{+} + q^{-} \le n$ and $m_i > 0$, $i = 1, 2, \ldots, q$. The q column-vectors $u_1, u_2, \ldots, u_q$ in (2.3), given by $u_i = (u_1(x^i), u_2(x^i), \ldots, u_n(x^i))$, are linearly independent.

The relations (2.3), (2.4), (2.5), and (2.6) are necessary conditions for finding optimal solutions. They can be supplemented by further conditions. Let $u_1, u_2, \ldots, u_n$ and $\phi$ have continuous partial derivatives of the first order. Define Q by

$$Q(x) = \sum_{r=1}^{n} y_r u_r(x).$$

Then relation (2.4) takes the form $Q(x^i) = \phi(x^i)$, $i = 1, 2, \ldots, q$. If x belongs to $S_K$ we have $Q(x) \ge \phi(x)$. Let $x^i$ be such that there is a nonzero vector h meeting the conditions:

(2.7) $$x^i + h \in S_K, \quad x^i - h \in S_K.$$

Define $\psi$ by $\psi(t) = Q(x^i + th) - \phi(x^i + th)$, $-1 \le t \le 1$. $\psi$ has a continuous derivative with respect to t on $[-1, 1]$, $\psi(t) \ge 0$ and $\psi(0) = 0$. Hence $\psi'(0) = 0$.

This observation can be utilized to derive further constraints on Q in the following manner. Determine for each point $x^i$ a system of linearly independent vectors h which meet the requirements (2.7). Denote these vectors by $h_1, h_2, \ldots, h_{l_i}$ (if there is none, put $l_i = 0$). We always have $l_i \le k$, the dimensionality of $S_K$. We note in passing that $l_i < k$ at boundary points only. Denote the directional derivative along $h_j$ by $D_j$. Then we must conclude

(2.8) $$D_j\big(Q(x^i) - \phi(x^i)\big) = 0, \quad j = 1, 2, \ldots, l_i \ (\text{if } l_i \ge 1), \quad i = 1, 2, \ldots, q.$$

Equations (2.3), (2.4), (2.5), (2.6), and (2.8) form a nonlinear system, and the optimal solution of the original problem can be found by solving it. The unknowns are the q masses $m_1, m_2, \ldots, m_q$, the n scalars $y_1, y_2, \ldots, y_n$, the vectors $x^1, x^2, \ldots, x^q$, and the numbers $\nu_1^{+}, \nu_2^{+}, \ldots, \nu_n^{+}, \nu_1^{-}, \nu_2^{-}, \ldots, \nu_n^{-}$.

We mention now two special cases: In the problems treated by Gustafson [20] and Gustafson-Kortanek-Rom [26] we have k = 1. Hence $l_j = 0$ at boundary points, $l_j = 1$ at interior points.
If $S_K$ is strictly convex (e.g., a nondegenerate ellipsoid), then $l_j = 0$ at boundary points and $l_j = k$ in the interior.

We next extend our results to nonlinear functions G. From Kortanek-Evans [34, p. 889] we get (after appropriate changes of notations):

LEMMA 2.6: Let G be a continuously differentiable function defined on an open convex set W in $R^n$. Consider the two problems:

(I) $\min G(y)$ when $y \in K$;  (I*) $\min y^T(\nabla G)_{y=y^*}$ when $y \in K$,

where K is a closed convex set in W. Then $y^*$ is optimal for I if and only if $y^*$ is optimal for I*, provided either one of the following conditions holds: (a) G is pseudo-convex; (b) G is quasi-convex and $(\nabla G)_{y=y^*} \ne 0$.

Using this lemma, we realize that if G meets condition (a) or (b) of the lemma, we can replace $D_F$ by a linear problem with the objective function $G^*$ defined by

$$G^*(y_1, y_2, \ldots, y_n) = \sum_{r=1}^{n} \left( \frac{\partial G}{\partial y_r} \right)_{y=y^*} y_r,$$

where $y_1^*, y_2^*, \ldots, y_n^*$ is an optimal solution of $D_F$. In our nonlinear system (2.3), (2.4), (2.5), (2.6), (2.8), we should replace $\mu_r$ by $\left( \frac{\partial G}{\partial y_r} \right)_{y=y^*}$ in (2.3) in order to allow for nonlinear objective functions. (The remaining equations were derived independently of the objective function G.)

DEFINITION 2.3: We denote by system NL the nonlinear system of equations obtained by combining Equations (2.3) through (2.6) with (2.8). If G is nonlinear, $\mu_r$ in (2.3) should be replaced by $\left( \frac{\partial G}{\partial y_r} \right)_{y=y^*}$ as described above.

EXAMPLE 2.4: Compute

$$\min \; 4y_1 + \tfrac{4}{3}(y_4 + y_6),$$

when

$$y_1 + x_1 y_2 + x_2 y_3 + x_1^2 y_4 + x_1 x_2 y_5 + x_2^2 y_6 \ge 3 - (x_1 - x_2)^2 (x_1 + x_2)^2$$

on $S_2 = \{(x_1, x_2) : |x_i| \le 1, \; i = 1, 2\}$. The associated moment problem $P_F^0$ reads: compute

$$\max_{\alpha} \int_{S_2} \left[ 3 - (x_1 - x_2)^2 (x_1 + x_2)^2 \right] d\alpha(x_1, x_2),$$

when

$$\int_{S_2} d\alpha = 4, \quad \int_{S_2} x_1\,d\alpha = 0, \quad \int_{S_2} x_2\,d\alpha = 0, \quad \int_{S_2} x_1^2\,d\alpha = \tfrac{4}{3}, \quad \int_{S_2} x_1 x_2\,d\alpha = 0, \quad \int_{S_2} x_2^2\,d\alpha = \tfrac{4}{3}.$$

By inspection we find that $P_F^0$ has a feasible solution with four mass points, each of mass $m = 1$, at

$$\left( \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}} \right), \quad \left( \tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}} \right),$$
$$\left( -\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}} \right), \quad \left( -\tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}} \right).$$

$D_F^0$ has the feasible solution $y_1 = 3$, $y_2 = y_3 = \ldots = y_6 = 0$. The preference function assumes the value 12 in both problems; that is, we have found an optimal solution. We observe that $d\alpha(x_1, x_2) = d(x_1, x_2)$ is another feasible solution of $P_F^0$, and hence we have found the quadrature rule with positive weights:

$$\int_{S_2} \phi(x_1, x_2)\,d(x_1, x_2) = \phi\!\left(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right) + \phi\!\left(\tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}}\right) + \phi\!\left(-\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right) + \phi\!\left(-\tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}}\right).$$

The rule has positive weights and is exact if $\phi$ is a polynomial of two variables and of degree less than 3.

3. COMPUTATIONAL SOLUTION

3.1. Definition of Acceptable Approximations

In this section we give the general principles of a computational scheme for the solution of the problem $D_F$ (Definition 2.2).

DEFINITION 3.1: Let y be any vector. We define the nonoptimality $\Delta Z(y)$ as $|Z - G(y)|$, where Z is the optimal value of $D_F$.

DEFINITION 3.2: Let again y be any vector. We define the discrepancy of y by

$$\delta(y) = \min_{x \in S_K} \left[ \sum_{r=1}^{n} y_r u_r(x) - \phi(x) \right].$$

Thus y is an optimal solution vector if $\delta(y) \ge 0$ and $G(y) = Z$. Generally, one has to be content with trying to find a vector y such that $\delta(y) \ge -\delta_0$ and $|G(y) - Z| \le \epsilon_0$, where the positive numbers $\delta_0$ and $\epsilon_0$ are given tolerances. (Such a y is called an acceptable approximation.) We want to show that this can be done by means of a finite number of operations, provided these are carried out with sufficiently good accuracy.

3.2 Cutting-plane Methods and Alternating Procedures

$D_F$ amounts to minimizing a convex function over a compact convex set K. This set is not specified in the form of a few simple equations. Instead, we know the supporting planes, which are:

$$\sum_{r=1}^{n} y_r u_r(x) - \phi(x) \ge 0, \quad x \in S.$$

One can therefore contemplate using the principles of the cutting-plane method of Kelley [33] and Wolfe [51], with the accelerating device by Wolfe [51]. The first algorithm by Remez (see, e.g., Cheney [9], p. 96) can be considered as a variant of the cutting-plane method.
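The discrepancy of Definition 3.2 is easy to estimate in practice by minimizing the slack over a fine sample of $S_K$. The following hypothetical sketch (not from the paper) does this for Example 2.3, where y = (1, -1/2) is known to be optimal, so its discrepancy is (numerically) zero, while a perturbed vector has a clearly negative discrepancy.

```python
# Sketch: estimate the discrepancy delta(y) = min over S_K of
# sum_r y_r*u_r(x) - phi(x) by sampling a fine grid. Data are taken
# from Example 2.3: u_1(x) = 1, u_2(x) = x, phi(x) = 1/(1+x) on [0, 1].

def discrepancy(y, u, phi, samples):
    return min(sum(yr * ur(x) for yr, ur in zip(y, u)) - phi(x)
               for x in samples)

u = [lambda x: 1.0, lambda x: x]
phi = lambda x: 1.0 / (1.0 + x)
xs = [i / 10000 for i in range(10001)]          # fine sample of S_K = [0, 1]

d_opt = discrepancy([1.0, -0.5], u, phi, xs)    # optimal y: slack touches 0
d_bad = discrepancy([0.9, -0.5], u, phi, xs)    # infeasible y: negative
```

Of course a grid estimate only bounds the true minimum from above; the error bounds of section 3.3.2 quantify exactly this gap in terms of the grid coarseness.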
We generalize this algorithm and define the following alternating procedure. (The word alternating refers to the fact that the optimal solution of $D_F$ is computed by alternately minimizing G over subsets of $K_F$ containing K and minimizing certain functions over S.)

The general step is: let $x^1, x^2, \ldots, x^{s-1}$ ($s \ge 2$) be given elements in S. Take $y^s$ as an optimal solution vector of the problem

$$\min G(y), \quad \text{subject to} \quad \sum_{r=1}^{n} y_r u_r(x^j) - \phi(x^j) \ge 0, \quad j = 1, 2, \ldots, s-1.$$

Then define $x^s$ as an element in S which minimizes

$$\sum_{r=1}^{n} y_r^s u_r(x) - \phi(x), \quad x \in S.$$

If this last minimum is nonnegative, the process is terminated. Otherwise we generate $y^{s+1}, y^{s+2}, \ldots$.

THEOREM 3.1: The sequence $y^s, y^{s+1}, \ldots$ generated by the alternating procedure above converges toward an optimal solution of $D_F$.

PROOF: Since $y^{s+1}$ meets all the constraints imposed on $y^s$, $G(y^{s+1}) \ge G(y^s)$. The same is true for any optimal vector of $D_F$. Hence $G(y^s) \le G(y^{s+1}) \le \ldots \le d$, where d is the optimal value. Three cases are conceivable, namely:

CASE A: The alternating process stops after a finite number of iterations.
CASE B: $\lim G(y^s) = d - \eta$, $\eta > 0$.
CASE C: $\lim G(y^s) = d$.

If Case A occurs, the optimal vector has been reached, because the last vector satisfies the constraints of K. We want to show that Case B is not possible. Since $\{y^s\}$ is an infinite sequence confined to the compact set $K_F$, it has accumulation points. Let $y^*$ be such a point. Put $G(y^*) = d - \epsilon$, $\epsilon > 0$. Hence $y^*$ does not belong to K. Let $\bar{x}$ be an element in S which minimizes $\sum_{r=1}^{n} y_r^* u_r(x) - \phi(x)$. Denote the corresponding minimum by $\Delta$. We must have $\Delta < 0$, since $y^* \notin K$. From the definition of $y^s$ we conclude

(*) $$\sum_{r=1}^{n} y_r^* u_r(x^j) - \phi(x^j) \ge 0, \quad j = 1, 2, \ldots.$$

Let now $\{y^{l_j}\}$ be a subsequence such that $y^{l_j} \to y^*$ and the $x^{l_j}$ tend toward an accumulation point $x^*$. We find for each j:

$$\sum_{r=1}^{n} y_r^{l_j} u_r(\bar{x}) - \phi(\bar{x}) \ge \sum_{r=1}^{n} y_r^{l_j} u_r(x^{l_j}) - \phi(x^{l_j}).$$
Letting $j \to \infty$ we arrive at

$$\Delta = \sum_{r=1}^{n} y_r^* u_r(\bar{x}) - \phi(\bar{x}) \ge \sum_{r=1}^{n} y_r^* u_r(x^*) - \phi(x^*).$$

But by (*),

$$\sum_{r=1}^{n} y_r^* u_r(x^*) - \phi(x^*) \ge 0$$

also, since $y^{l_j} \to y^*$, $x^{l_j} \to x^*$. This contradicts $\Delta < 0$, and hence Case B is not possible. Theorem 3.1 is therefore proven.

The alternating method might be effective when an approximate solution vector of $D_F$ is known. If the objective function of $D_F$ is linear, one can use the algorithms given in Gustafson [20] and Gustafson-Kortanek-Rom [26]. As a matter of fact, if the objective function is convex, $D_F$ can be solved by solving a sequence of semi-infinite programs. We show now that the solution vector of $D_F$ can be constructed as an accumulation point of the sequence $y^0, y^1, \ldots$ constructed recursively as follows. Let $y^0$ belong to K (see Definition 2.1). When $y^0, y^1, \ldots, y^{l-1}$ ($l = 1, 2, \ldots$) are determined, we define the linear functions

$$\pi_j(y) = G(y^j) + \sum_{r=1}^{n} (y_r - y_r^j) \left( \frac{\partial G}{\partial y_r} \right)_{y=y^j}.$$

Then we define $y^l$ as the optimal solution of the problem

$$\min \pi_{l-1}(y),$$

subject to

$$\pi_{l-1}(y) \ge \pi_j(y), \quad j = 0, 1, \ldots, l-2,$$
$$\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_K,$$
$$\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_F.$$

From Kelley [33] we conclude that $\{y^l\}$ contains a subsequence converging toward an optimal solution vector of $D_F$. The methods discussed here can be of use to construct an approximate solution of the system NL, which is then solved by means of the rapidly converging Newton-Raphson iteration.

3.3 Approximation with Problems with a Finite Number of Constraints

3.3.1 Generalities

We first introduce:

DEFINITION 3.3: A finite subset T of $S_K$, $T = \{x^1, x^2, \ldots, x^N\}$, is called a grid.

DEFINITION 3.4: We denote by problem $D_F - T$ the task: Compute

$$\min G(y),$$

subject to

$$\sum_{r=1}^{n} y_r u_r(x^j) \ge \phi(x^j), \quad x^j \in T,$$
$$\sum_{r=1}^{n} y_r u_r(x) \ge \phi(x), \quad x \in S_F.$$

($S_F$ is the same set as in the definition of problem $D_F$.) This problem can be solved by standard techniques of mathematical programming.
The solution of $D_F - T$ can be used to approximate that of $D_F$. We shall now derive bounds for the discrepancy and nonoptimality expressed in grid data and characteristics of the functions $u_1, u_2, \ldots, u_n, \phi$ and G.

3.3.2 Error Bounds for Optimal Solutions of $D_F - T$

We need two definitions.

DEFINITION 3.5: Let a grid T and a norm be given on $S_K$. The number $|T|$ given by

$$|T| = \max_{x \in S_K} \min_{x^j \in T} \|x - x^j\|$$

will be called the coarseness of T.

This definition of $|T|$ agrees with the concept of "density" in Cheney [9, p. 84], but it is not consonant with the definition in Gustafson [20, p. 350]. The latter could not be directly extended to multidimensional grids. Since $S_K$ is finite-dimensional, all norms are topologically equivalent and therefore any norm can be used.

DEFINITION 3.6: Continuity modulus of a real-valued function $\psi$. If $\psi$ is continuous on $S_K$ we define the function $\omega_\psi$ as follows:

$$\omega_\psi(z) = \sup |\psi(x') - \psi(x'')|, \quad \text{subject to} \quad x' \in S_K, \; x'' \in S_K, \; \|x' - x''\| \le z.$$

$\omega_\psi$ is called the modulus of continuity of $\psi$. $\omega_\psi$ is nonnegative and increasing. Since $\psi$ is continuous, $\lim_{z \to +0} \omega_\psi(z) = 0$. (Compare Cheney [9, p. 86].) We can now prove:

LEMMA 3.1: Let $y^T$ be an optimal solution of $D_F - T$. Then

(3.1) $$\delta(y^T) \ge -\Delta_T(|T|),$$

where

(3.2) $$\Delta_T(|T|) = \sum_{r=1}^{n} |y_r^T| \, \omega_{u_r}(|T|) + \omega_\phi(|T|).$$

PROOF: (The arguments are a slight generalization of those in Gustafson [20, pp. 351-352].) Put

$$\psi = \sum_{r=1}^{n} y_r^T u_r - \phi.$$

Let $x \in S_K$. We want to get a lower bound for $\psi(x)$. By the definition of $|T|$ there is a gridpoint $x^j$ such that $\|x - x^j\| \le |T|$. We write $\psi(x) = \psi(x^j) + \psi(x) - \psi(x^j)$, giving

$$\psi(x) \ge \psi(x^j) - |\psi(x) - \psi(x^j)|.$$

Hence

$$\psi(x) \ge \psi(x^j) - \omega_\psi(\|x - x^j\|) \ge -\omega_\psi(|T|),$$

since $\psi(x^j) \ge 0$, $\|x - x^j\| \le |T|$ and $\omega_\psi$ is positive and increasing. We find immediately $\omega_\psi(|T|) \le \Delta_T(|T|)$. Q.E.D.

This result can be strengthened to:

LEMMA 3.2: Let $y^T$ be an optimal solution of $D_F - T$.
Then

(3.3) $$\delta(y^T) \ge -\Delta(|T|),$$

where

(3.4) $$\Delta(|T|) = \omega_\phi(|T|) + F \sum_{r=1}^{n} \omega_{u_r}(|T|), \quad \text{and} \quad F = \max\{|F_1|, |F_2|\}.$$

PROOF: Use the fact that $|y_r^T| \le F$.

The bound in Lemma 3.2 is more conservative than that in Lemma 3.1, but it is a priori in the sense that we do not need to know $y^T$ in order to evaluate it. Hence we can tell in advance how small $|T|$ must be selected in order to get a discrepancy below a given tolerance. We next derive bounds for the nonoptimality.

LEMMA 3.3: Let $y^T$ be an optimal solution of $D_F - T$. Let

(3.5) $$H = \sum_{r=1}^{n} c_r u_r$$

be positive* over $S_K$ and put

(3.6) $$\gamma = \min_{x \in S_K} H(x).$$

Put

(3.7) $$y_r^H = y_r^T + \gamma^{-1} c_r \Delta_T(|T|), \quad r = 1, 2, \ldots, n,$$

where $\Delta_T$ is defined by (3.2). Then

(3.8) $$\left| \tfrac{1}{2} \left( G(y^T) + G(y^H) \right) - G(\bar{y}) \right| \le \tfrac{1}{2} \left( G(y^H) - G(y^T) \right),$$

where $\bar{y}$ is an optimal solution of $D_F$.

*The existence of H is guaranteed by Krein's condition.

PROOF: Put $Q^H = \sum_{r=1}^{n} y_r^H u_r$. We want to show that $Q^H(x) \ge \phi(x)$, $x \in S_K$. We write

$$Q^H(x) - \phi(x) = Q^H(x) - Q^T(x) + Q^T(x) - \phi(x),$$

giving

$$Q^H(x) - \phi(x) = \gamma^{-1} H(x) \Delta_T(|T|) + Q^T(x) - \phi(x).$$

Thus, by Lemma 3.1,

$$Q^H(x) - \phi(x) \ge \Delta_T(|T|) \left( \gamma^{-1} H(x) - 1 \right) \ge 0.$$

Hence $y^H$ is a feasible solution of $D_F$ and therefore $G(y^H) \ge G(\bar{y})$. On the other hand, $y^T$ is the optimal solution of $D_F - T$ and we can therefore conclude

$$G(y^H) \ge G(\bar{y}) \ge G(y^T),$$

from which (3.8) follows. Q.E.D.

Lemma 3.3 can be used to derive a posteriori bounds for the nonoptimality. We can namely prove Lemma 3.4.

LEMMA 3.4: Let $y^T$ be an optimal solution of $D_F - T$. Then we can replace $\Delta_T$ in (3.7) by $\Delta$ in (3.4), and (3.8) gives an a priori bound. Further, if v is such that

$$\left| \frac{\partial G}{\partial y_r} \right| \le v_r, \quad r = 1, 2, \ldots, n,$$

everywhere, then

$$\left| \tfrac{1}{2} \left( G(y^T) + G(y^H) \right) - G(\bar{y}) \right| \le \tfrac{1}{2} \gamma^{-1} \Delta(|T|) \sum_{r=1}^{n} |c_r v_r|.$$

Often bounds on the partial derivatives of $u_1, \ldots, u_n$ and $\phi$ are known. Then the expressions for $\Delta$ and $\Delta_T$ can be simplified. Let $x'$ and $x' + h$ belong to S. Then the mean-value theorem gives

$$|\phi(x' + h) - \phi(x')| = \left| \sum_{r=1}^{k} h_r \left( \frac{\partial \phi}{\partial x_r} \right)_{x=\xi} \right|,$$

where $\xi = x' + \theta h$ for some number $\theta$ in (0, 1). Put

(3.9) $$\kappa_\phi = \sup_{x \in S_K} \sup_{h \ne 0} \|h\|^{-1} \left| \sum_{r=1}^{k} h_r \frac{\partial \phi}{\partial x_r} \right|.$$

Hence
$$|\phi(x' + h) - \phi(x')| \le \|h\| \, \kappa_\phi, \quad \text{giving} \quad \omega_\phi(h) \le \|h\| \, \kappa_\phi.$$

Defining $\kappa_{u_r}$ in the same manner as $\kappa_\phi$, we obtain

(3.10) $$\Delta_T(\zeta) \le \zeta \Delta_T', \qquad (3.11) \quad \Delta(\zeta) \le \zeta \Delta',$$

where

$$\Delta_T' = \kappa_\phi + \sum_{r=1}^{n} |y_r^T| \kappa_{u_r}, \qquad \Delta' = \kappa_\phi + F \sum_{r=1}^{n} \kappa_{u_r}, \qquad F = \max\{|F_1|, |F_2|\}.$$

If we replace $\Delta_T$ and $\Delta$ with the bounds (3.10) and (3.11) and revise the arguments in the preceding four lemmas, we arrive at Theorem 3.2.

THEOREM 3.2: If $u_1, u_2, \ldots, u_n$ and $\phi$ have bounded partial derivatives of the first order and $y^T$ is an optimal solution of $D_F - T$, one can give explicit a priori bounds on the nonoptimality and discrepancy of $y^T$. These bounds are proportional to $|T|$.

A further refinement is possible if $u_1, u_2, \ldots, u_n$ and $\phi$ have continuous partial derivatives of the second order. Let $y^T$ be an optimal solution of $D_F - T$. Put

$$\psi = \sum_{r=1}^{n} y_r^T u_r - \phi.$$

Then $\psi(x^j) \ge 0$ for all $x^j \in T$. A lower bound of $\psi(x)$, $x \in S_K$, can be constructed from the following result.

LEMMA 3.5: Let $x^1, x^2, \ldots, x^{k+1}$ be k + 1 given points in $S_K$ and h a number such that

$$0 < |x_i^r - x_i^1| \le h, \quad i = 1, 2, \ldots, k; \quad r = 2, \ldots, k+1,$$

and the determinant

$$\begin{vmatrix} x_1^1 & \cdots & x_k^1 & 1 \\ \vdots & & \vdots & \vdots \\ x_1^{k+1} & \cdots & x_k^{k+1} & 1 \end{vmatrix} \ne 0.$$

Take a fixed point x in the convex hull U of $x^1, x^2, \ldots, x^{k+1}$. Put $R(x) = \sup f(x)$, when f varies over all functions with continuous partial derivatives of the second order such that

(3.12) $$\left| \frac{\partial^2 f}{\partial x_i \partial x_j} \right| \le 2 c_{ij}, \quad x \in U,$$

where the $c_{ij}$, $i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, k$, are given constants, and f meets the condition

(3.13) $$f(x^j) = 0, \quad j = 1, 2, \ldots, k+1;$$

then

(3.14) $$R(x) = \sum_{i=1}^{k} \sum_{j=1}^{k} c_{ij} \left| x_i x_j - \sum_{r=1}^{k+1} \lambda_r x_i^r x_j^r \right|,$$

where $\lambda_1, \lambda_2, \ldots, \lambda_{k+1}$ are determined by

$$x = \sum_{r=1}^{k+1} \lambda_r x^r.$$

REMARK: To determine R(x) for a fixed x is an instance of problem D in section 1 when S is infinite-dimensional; S is namely the space of all functions of k variables which have continuous partial derivatives of the second order.

PROOF: Without loss of generality, we can assume that the coordinates are chosen such that $x^1 = 0$.
Taylor's formula (expanding about $x^1$) gives, since $f(x^1) = 0$:

$$f(x) = \sum_{j=1}^{k} a_j x_j + \sum_{i=1}^{k} \sum_{j=1}^{k} b_{ij} x_i x_j,$$

where

$$a_j = \left( \frac{\partial f}{\partial x_j} \right)_{x=x^1}, \qquad b_{ij}^{(r)} = \frac{1}{2} \left( \frac{\partial^2 f}{\partial x_i \partial x_j} \right)_{x = \theta_r x^1 + (1-\theta_r)x^r}, \quad 0 \le \theta_r \le 1, \quad r = 2, 3, \ldots, k+1.$$

Hence we arrive at the problem: Compute

$$L = \sup \sum_{j=1}^{k} a_j x_j + \sum_{i=1}^{k} \sum_{j=1}^{k} b_{ij} x_i x_j,$$

subject to

(3.15) $$\sum_{j=1}^{k} a_j x_j^r + \sum_{i=1}^{k} \sum_{j=1}^{k} b_{ij}^{(r)} x_i^r x_j^r = 0, \quad r = 2, 3, \ldots, k+1, \qquad |b_{ij}| \le c_{ij}, \quad |b_{ij}^{(r)}| \le c_{ij}.$$

Since x is in the convex hull of $\{x^r\}$, $r = 1, 2, \ldots, k+1$, we can write

$$x = \sum_{r=1}^{k+1} \lambda_r x^r, \quad \text{where} \quad \sum_{r=1}^{k+1} \lambda_r = 1, \quad \lambda_r \ge 0, \quad r = 1, 2, \ldots, k+1,$$

giving, by (3.15),

$$\sum_{j=1}^{k} a_j x_j = -\sum_{r=2}^{k+1} \lambda_r \sum_{i=1}^{k} \sum_{j=1}^{k} b_{ij}^{(r)} x_i^r x_j^r.$$

Thus we are left with the task: Compute

$$L = \sup \sum_{i=1}^{k} \sum_{j=1}^{k} \left( b_{ij} x_i x_j - \sum_{r=2}^{k+1} \lambda_r b_{ij}^{(r)} x_i^r x_j^r \right), \quad |b_{ij}| \le c_{ij}, \quad |b_{ij}^{(r)}| \le c_{ij}.$$

Hence we should take

$$b_{ij} = c_{ij} \, \mathrm{sign} \left( x_i x_j - \sum_{r} \lambda_r x_i^r x_j^r \right).$$

Entering $|b_{ij}| = |b_{ij}^{(r)}| = c_{ij}$, we get the bound sought. The determinant condition implies that $\lambda_1, \lambda_2, \ldots, \lambda_{k+1}$ are uniquely determined by x and $x^1, x^2, \ldots, x^{k+1}$, and hence the lemma is proven.

If we now make the substitutions $x^r = h\xi^r$, $x = h\xi$ in (3.14), we get

$$R(x) \le h^2 \sum_{i=1}^{k} \sum_{j=1}^{k} c_{ij} \left| \xi_i \xi_j - \sum_{r=1}^{k+1} \lambda_r \xi_i^r \xi_j^r \right|,$$

where

$$\xi = \sum_{r=1}^{k+1} \lambda_r \xi^r.$$

If now the bounds on the derivatives hold uniformly over $S_K$, we get

$$\delta(y^T) \ge -e \cdot |T|^2,$$

where e is determined by the distribution of the grid points T and the bounds of the second-order partial derivatives on $S_K$. Hence, if we consider a sequence $T_1, T_2, \ldots$ of hypercubic grids, the bound on $|\delta(y^{T_i})|$ decreases as the square of $|T_i|$, $i = 1, 2, \ldots$. Revising the arguments leading to Theorem 3.2 above, we find that the same holds true for the bound on the nonoptimality.

3.3.3 Convergence Results when $|T| \to 0$

LEMMA 3.6: To every $\epsilon > 0$ there is an $h > 0$ such that if $|T| < h$ and $y^T$ is an optimal solution of $D_F - T$, there is a $\bar{y}$ which is an optimal solution of $D_F$ and satisfies $\|\bar{y} - y^T\| < \epsilon$.

PROOF: The same arguments apply as in Gustafson [20, Theorem 3.3].

This result can be both generalized and sharpened.
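The convergence of the relaxations $D_F - T$ as the grid is refined can be checked numerically. The following hypothetical sketch (not from the paper) solves the grid problems for Example 2.3 on successively finer midpoint grids; these grids deliberately exclude the active points x = 0 and x = 1, so the optimal values stay strictly below Z = 3/4 and increase toward it as the coarseness shrinks.

```python
# Sketch: optimal values of the relaxations D_F - T approach Z = 3/4
# for Example 2.3 (min y1 + y2/2 s.t. y1 + y2*x >= 1/(1+x), x in [0,1])
# as the grid T is refined. Midpoint grids with N points have coarseness
# |T| = 1/(2N) and exclude the active points 0 and 1.
from itertools import combinations

def value_on_grid(grid):
    b = [1.0 / (1.0 + x) for x in grid]
    best = None
    for i, j in combinations(range(len(grid)), 2):  # 2-variable LP solved by
        y2 = (b[i] - b[j]) / (grid[i] - grid[j])    # enumerating vertices
        y1 = b[i] - grid[i] * y2
        if all(y1 + y2 * x >= bk - 1e-12 for x, bk in zip(grid, b)):
            v = y1 + 0.5 * y2
            best = v if best is None else min(best, v)
    return best

vals = [value_on_grid([(j + 0.5) / N for j in range(N)]) for N in (2, 4, 8)]
# vals climbs toward 0.75 as |T| is halved, never exceeding it: each grid
# problem is a relaxation of the semi-infinite program.
```

For N = 2 the value is exactly 24/35; doubling N twice visibly closes most of the remaining gap to 3/4, in line with the error bounds of subsection 3.3.2.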
If D_F − T has an optimal solution y_T, we can associate with it the problem P_F − T given below.

DEFINITION 3.7: Let the grid T be {x¹, x², ..., x^N}. We denote by problem P_F − T the task:

Compute  max Σ_{j=1}^N m_j φ(x^j) + Σ_{r=1}^n ( F₁ v_r⁺ − F₂ v_r⁻ ),

subject to

Σ_{j=1}^N m_j u_r(x^j) + v_r⁺ − v_r⁻ = ∂G/∂y_r,  m_j ≥ 0,  v_r⁺ ≥ 0,  v_r⁻ ≥ 0.

P_F − T is a linear program (even if G is not linear) and hence has an optimal solution which corresponds to a point-mass distribution with, at most, n mass-points. Select such a solution of P_F − T with the minimum number of mass-points. Then with each T we can associate the vector z(T) given by

(3.16)  z(T) = (ξ¹(T), ξ²(T), ..., ξ^ν(T), y(T)),

where the ξ^j are the mass-carrying points. An optimal solution of P_F − T and D_F − T is uniquely determined if z(T) is given, since the vectors ξ¹(T), ..., ξ^ν(T) are linearly independent and v_r⁺, v_r⁻ enter the solution if y_r = F₁ or F₂, respectively.

Let ‖·‖ be a norm on S_k. We define ‖z‖ by

‖z‖ = Σ_j ‖ξ^j‖ + Σ_r |y_r|.

THEOREM 3.3: Let T_j, j = 1, 2, ..., be a sequence of grids such that |T_j| → 0 when j → ∞. With each T_j we associate the vector z(T_j), defined analogously with z(T) above. Then we can find a subsequence z(T_{j_s}) converging towards a vector z̄ which describes a solution of P_F and D_F.

PROOF: Since ν ≤ n, z(T) has at most n(k+1) components. S_k is a compact set and F₁ ≤ y_r ≤ F₂. Hence we can find a number B such that ‖z(T_j)‖ ≤ B. Therefore {z(T_j)} is confined to a bounded subset of a finite-dimensional Banach space. Thus {z(T_j)} is contained in a compact set. We first select a subsequence T_{j₁}, T_{j₂}, ... such that y(T_{j_s}) converges towards a vector ȳ. Using the same arguments as in Theorem 3.3 in Gustafson [20], we establish that ȳ is an optimal solution of D_F. To each y(T_{j_s}) there is an optimal solution of P_F − T_{j_s}. We want to define vectors z(T_{j_s}) according to (3.16), which we do recursively as follows: Let z(T_{j_s}) be given and of the form

z(T_{j_s}) = (ξ¹(T_{j_s}), ξ²(T_{j_s}), ..., ξ^ν(T_{j_s}), y(T_{j_s})).
Let P_F − T_{j_{s+1}} have an optimal solution with the mass-carrying points u¹, u², ..., u^ν̄. We now define ξ^l(T_{j_{s+1}}) equal to a vector from the set {u¹, u², ..., u^ν̄} which minimizes

‖ξ^l(T_{j_s}) − u^a‖,  1 ≤ a ≤ ν̄.

This is done for l = 1, 2, ..., min(ν, ν̄). There are three cases:

(a) ν < ν̄. The vectors from {u¹, u², ..., u^ν̄} which have not been matched are also put in the vector z(T_{j_{s+1}}).
(b) ν = ν̄. No subsequent change in the definition of z(T_{j_{s+1}}) is made.
(c) ν > ν̄. The vectors in z(T_{j_s}) which have not been matched are transferred to z(T_{j_{s+1}}).

Hence, in all cases max(ν, ν̄) vectors are put in the vector z(T_{j_{s+1}}), which also contains y(T_{j_{s+1}}). In the manner described above, we define recursively z(T_{j₁}), z(T_{j₂}), .... In no case will any of these vectors contain more than n points. We can now take a subsequence which converges towards an accumulation point, which we call z*, from which a solution of D_F and its corresponding point-masses can be constructed.

REMARK: From the construction of z* it is obvious that certain of the points represented in the vector z* carry the mass zero. They can hence be removed. Other points can be confluent. If this is the case, only one member from every group of confluent points is carried. The last-mentioned case is common. Compare Gustafson [20] and subsection 3.3.5.

The idea to approximate a semi-infinite program by an optimization problem with a finite number of constraints is, of course, not new. Convergence results of the same character can be found, e.g., in Vershik-Temel't [48] and Cheney [9, pp. 86-88].

3.3.4 Special Devices to Economize and Stabilize the Simplex Method

In this subsection we consider the case when G is linear, that is,

G(y) = Σ_{r=1}^n y_r p_r.

(As noted in Section 3.2, every convex semi-infinite program can be solved by solving a sequence of linear semi-infinite programs.)
In this case, D_F − T is a linear program and an optimal vector y_T can be constructed with the Simplex method. Let now y be a candidate for y_T and put

(3.17)  ψ_y = Σ_{r=1}^n y_r u_r − φ.

We want to investigate whether ψ_y(x^j) ≥ 0 for all x^j ∈ T. Let x^i ∈ T be such that ψ_y(x^i) > 0. For all x in S_k we find the bound

ψ_y(x) ≥ ψ_y(x^i) − ω_{ψ_y}(‖x − x^i‖).

We note that

ω_{ψ_y}(t) ≤ Σ_{r=1}^n |y_r| ω_{u_r}(t) + ω_φ(t),  t > 0.

If we put

S_i = { x | ω_{ψ_y}(‖x − x^i‖) ≤ ψ_y(x^i) },

we can conclude that x ∈ S_i implies ψ_y(x) ≥ 0. Thus in particular, if x^j ∈ T is in S_i, the corresponding column need not even be generated in order to establish that ψ_y(x^j) ≥ 0.

Another problem is that numerical difficulties can be anticipated when |T| is small, due to the fact that the Simplex algorithm calls for the solution of a linear system whose matrix of coefficients might be nearly singular. Each Simplex iteration consists of two stages:

A. Determination of a candidate vector y by solution of the system

(3.18)  Σ_{r=1}^n y_r u_r(x^j) = φ(x^j),  j = 1, 2, ..., n,

where the x^j belong to T and the corresponding rows are linearly independent. Then we have to find the sign of ψ_y(x^i) for all x^i ∈ T, where ψ_y is defined by (3.17).

B. If min ψ_y(x^i), x^i ∈ T, is attained for i = i₁, we introduce x^{i₁} into the next basis and hence we have to solve systems of the form

(3.19)  Σ_{j=1}^n m_j u_r(x^j) = μ_r,  r = 1, 2, ..., n,  m_j ≥ 0.

As remarked in Gustafson [20], the abscissae often lie close together in pairs, which will cause the matrix of coefficients in (3.18) and (3.19) to be ill-conditioned. Consider first the problem of determining ψ_y(x) from (3.18). Let ȳ be the computed value of y and put

ε_j = Σ_{r=1}^n ȳ_r u_r(x^j) − φ(x^j),  j = 1, 2, ..., n.

Let further Δψ_y(x) be the error in the value of ψ_y(x) caused by the fact that we use ȳ instead of y. Gustafson [22] gives a bound on |Δψ_y(x)| in terms of the residuals ε_j and the functions p_j defined by

Σ_{j=1}^n p_j(x) u_r(x^j) = u_r(x),  r = 1, 2, ..., n.

Put

p(x) = Σ_{j=1}^n |p_j(x)|.
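The practical import of the residuals ε_j can be seen in a small numerical sketch (ours, not the paper's): even when the abscissae x^j lie close together in pairs, so that the coefficient matrix of (3.18) is nearly singular, Gaussian elimination with partial pivoting returns a vector ȳ whose residuals are tiny. The basis u_r(x) = x^r and the right-hand side φ(x) = e^x are illustrative choices only.

```python
import math

def solve_pivot(A, b):
    """Gaussian elimination with partial pivoting (row interchanges)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # choose pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            fac = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= fac * M[k][j]
    y = [0.0] * n
    for k in range(n - 1, -1, -1):          # back substitution
        s = sum(M[k][j] * y[j] for j in range(k + 1, n))
        y[k] = (M[k][n] - s) / M[k][k]
    return y

# Abscissae lying close together in pairs: the 4x4 Vandermonde-type
# coefficient matrix of (3.18) is then nearly singular.
xs = [0.0, 0.5, 0.5001, 1.0]
A = [[x ** r for r in range(4)] for x in xs]   # u_r(x^j) = (x^j)**r
b = [math.exp(x) for x in xs]                  # phi(x^j)
y = solve_pivot(A, b)
residuals = [sum(A[j][r] * y[r] for r in range(4)) - b[j] for j in range(4)]
print(max(abs(e) for e in residuals))  # small despite the ill-conditioning
```

The computed coefficients may individually be inaccurate, but the residuals, and hence the error bound on Δψ_y(x) through p(x), remain small; this is the behavior of pivoted elimination that Wilkinson's analysis, cited below, guarantees.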
We note then that p(x) is a continuous function over S_k and that p(x^j) = 1, j = 1, 2, ..., n. Further, p is completely determined by the system u₁, u₂, ..., u_n and hence independent of the way we perform our computations. The ε_j are, however, dependent both on the computer and on the manner in which (3.18) is solved. Let ε̄_j be the value we should obtain if we inserted the exact vector y in (3.18) and evaluated the residuals computationally. Wilkinson [50, p. 252] states that if (3.18) is solved by means of Gaussian elimination with pivoting, then ‖ε‖ ≤ 3‖ε̄‖ even if the system is ill-conditioned. Therefore, if we solve (3.18) in this mode, |Δψ_y(x)| is as small as possible. However, the ordinary Simplex method does not provide pivoting when the sequence of linear systems is solved, a fact that may be the cause of the often-observed instability of linear programming codes. In contrast, the variant of Bartels-Golub-Saunders [52] holds promise to be more stable. When we solve (3.19), it is crucial that the sign of each m_j remains positive.

3.3.5 Construction of an Initial Approximation for the Newton-Raphson Method

We discuss now how to construct an initial approximation for system NL (Definition 2.3), when the solution of D_F − T and P_F − T is known. In this subsection we make the following general assumptions:

A1: u₁, u₂, ..., u_n and φ have continuous partial derivatives of the second order.

A2: G is linear:

(3.20)  G(y) = Σ_{r=1}^n p_r y_r.

(If G has continuous partial derivatives, we put

(3.21)  G(y) ≈ G(y*) + Σ_{r=1}^n (y_r − y_r*) (∂G/∂y_r) evaluated at y = y*,

where y* is a solution of D_F − T.)

A3: The matrix A_y(x) given by

(3.22)  A_y(x) = ( ∂²ψ_y / ∂x_i ∂x_j )

is positive definite when x is a zero of ψ_y, where ψ_y is defined by (3.17).

REMARK: The linearization (3.21) is used when we employ the iterative process described at the end of subsection 3.2. Assumption A3 entails that the zeroes correspond to strict minima of ψ_y. A3 is difficult to verify in advance.
The major problem in finding an approximate solution is to determine the number of mass-points in an optimal solution of D_F when the solution of D_F − T and its primal P_F − T are known. Let y_T be an optimal solution of D_F − T and let an optimal solution of P_F − T be described by the pairs

(3.23)  (ξ^i, m^i),  i = 1, 2, ..., n'.

The vector ξ^i gives the location of the mass m^i. P_F − T may have many optimal solutions, but we can always take one with n' ≤ n.

*Research Report, Department of Computer Sciences, Stanford, 1969.

DEFINITION 3.8: A subset C_j of {ξ^i}, i = 1, 2, ..., n', is called a cluster if each member of C_j lies at most 3|T| from any other member of C_j, and C_j cannot be expanded by inclusion of more elements of {ξ^i}. Thus we divide the set (3.23) uniquely into q clusters, where 1 ≤ q ≤ n'.

DEFINITION 3.9: Put

ψ_T = Σ_{r=1}^n y_r^T u_r − φ

and define the matrix A_T analogously with A_y in (3.22). C_j is called a point-group if A_T(x^j) is positive definite and

‖∇ψ_T(x^j)‖ ≤ 0.5 ‖A_T(x^j)‖,  all x^j ∈ C_j.

Two cases are possible, namely, case (a): all clusters are point-groups; and case (b): there is a cluster that is not a point-group.

LEMMA 3.7: Let A1, A2, A3 hold. Then there is a number h' such that if |T| < h', then all clusters are point-groups.

PROOF: Assume the contrary. Using Theorem 3.3, we can then select a sequence T₁, T₂, ... of grids such that |T_l| → 0 and such that the corresponding vectors z(T_l) tend to a vector z̄ while at least one of the clusters of z(T_l) is not a point-group. Let the clusters be C₁(l), C₂(l), ..., C_{q(l)}(l). Each of the clusters contains at most n elements, and hence their diameter is less than 3n|T_l|. Hence all the mass-carrying points in a cluster converge toward the same point. Denote these limit-points by ξ̄_j, j = 1, 2, ..., q. Using Assumption A3, we conclude that there is a δ₀ > 0 such that if ‖x − ξ̄_j‖ < δ₀, j = 1, 2, ..., q, then ‖A(x)‖ ≥ 4‖∇ψ(x)‖. The convergence of y(T₁), y(T₂), ...
implies that there is an N₁ such that ‖A_{T_l}(x)‖ ≥ 2‖∇ψ_{T_l}(x)‖ for l > N₁ and ‖x − ξ̄_j‖ < δ₀. In the same manner we establish that there is an N₂ such that l > N₂ implies that A_{T_l}(x) is positive definite. Therefore, if l > max(N₁, N₂), then C_j is a point-group, j = 1, 2, ..., q. Hence the sought contradiction is established.

If T is such that the set (3.23) can be subdivided into clusters, all of which are point-groups, we construct an initial approximation to system NL as follows:

1. The masses in each point-group are combined and allocated at the group's center of gravity. Hence we take q equal to the number of point-groups, and each point-group corresponds to a mass-point.
2. If a mass-point is less than 3|T| from the boundary, it is moved to the nearest boundary-point.
3. The point-mass distribution so obtained is taken as the first approximation.
4. Equations (2.4) and (2.8) are determined by the distribution of mass-points.

3.3.6 Use of Newton-Raphson Methods

Applying the methods described in the preceding section, we obtain an approximate solution of the system NL. A solution of this system can be described by a vector z of the general structure

(3.24)  z = (m₁, x¹, m₂, x², ..., m_q, x^q, v⁺, v⁻, y),

where the point x^i carries the mass m_i and y is an optimal solution of D_F.

DEFINITION 3.10: Any vector of the general structure (3.24) will be called a trial vector if m_i > 0, i = 1, 2, ..., q; v_r⁺ ≥ 0, r = 1, 2, ..., n; and v_r⁻ ≥ 0, r = 1, 2, ..., n.

LEMMA 3.8: Any solution of (2.3)-(2.6), (2.8) which is a trial vector can be used to give a lower bound on the optimal value of D_F. If also

Σ_{r=1}^n y_r u_r(x) ≥ φ(x),  x ∈ S_k,

then y is an optimal solution of D_F.

PROOF: A trial vector which is a solution describes a feasible solution of P_F − T for a certain T. The conclusions follow, then, from known duality relations. Q.E.D.

We want to generate a sequence of trial vectors which converge toward an optimal solution.
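Steps 1 and 2 of the initial-approximation construction of subsection 3.3.5 (combine the masses of each cluster and allocate them at the cluster's center of gravity) can be sketched in a few lines. This is our illustrative code, not the authors'; for simplicity it works in one dimension and builds clusters by transitive chaining of the 3|T| neighbor relation, a slight relaxation of Definition 3.8.

```python
def initial_mass_points(points, masses, mesh):
    """Group mass-points whose mutual distance is at most 3*mesh
    (cf. Definition 3.8), then replace each group by a single
    mass-point at its center of gravity (step 1, subsection 3.3.5)."""
    n = len(points)
    parent = list(range(n))          # union-find over the mass-points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(points[i] - points[j]) <= 3 * mesh:
                parent[find(i)] = find(j)    # chain neighbors together

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    combined = []
    for members in groups.values():
        m = sum(masses[i] for i in members)                  # combined mass
        c = sum(masses[i] * points[i] for i in members) / m  # center of gravity
        combined.append((c, m))
    return sorted(combined)

# Two mass-points 0.001 apart on a grid of mesh 0.01 collapse to one point.
print(initial_mass_points([0.200, 0.201, 0.700], [0.3, 0.2, 0.5], 0.01))
```

In the full scheme one would additionally test the point-group conditions of Definition 3.9 (A_T positive definite, ‖∇ψ_T‖ ≤ 0.5‖A_T‖) before accepting the combined points as a first approximation to system NL.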
We write the system NL in the general form W(z) = 0. Assume a trial vector z^j is known. Then we want to construct a correction h^j such that

(3.25)  ‖W(z^j + h^j)‖ < ‖W(z^j)‖,

and then put

(3.26)  z^{j+1} = z^j + h^j,

if h^j can be selected such that z^j + h^j is a trial vector. In the classical Newton-Raphson method, we take h^j as the solution b^j of the system

(3.27)  Σ_{i=1}^N ( ∂W(z^j)/∂z_i ) b_i^j + W(z^j) = 0,

where N is the number of components of z. If the matrix in (3.27) is singular, the methods in Ben-Israel [2] can be used. In Ortega-Rheinboldt [40, p. 421] we find general criteria for the convergence of the Newton-Raphson method, but they cannot, in general, be used. However, if the matrix in (3.27) is regular for all z^j and condition (3.25) is met, then {z^j} converges toward a local minimum of the function ‖W(z)‖. The same is true for the modified sequence {z^j} obtained by putting h^j = λ_j b^j, where the real number λ_j is such that condition (3.25) is met together with the requirement that z^{j+1} is a trial vector.

The general idea is to generate a sequence of trial vectors until the norm of the residual falls below a value prescribed in advance. If this does not happen before a given maximum time has elapsed, the process is assumed to diverge. Then one can use the last trial vector found to construct better approximations by means of the grid-point methods described in earlier sections. We note that D_F subsumes a large class of different problems. In many important particular instances, special methods can be used both to simplify the computational scheme and to establish properties of convergence and unicity. We will return to this in later papers.

3.3.7 Remarks on Sensitivity

It is well known from the numerical solution of special cases of D_F (see, e.g., Example 1.3 in the introduction) that small changes in input data cause large dislocations of the x^i and m_i, but that the optimal value is not affected very much.
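The damped iteration h^j = λ_j b^j can be sketched on a toy system (ours, not from the paper): W(z) = (z₁² + z₂ − 3, z₁ + z₂² − 5), with the positive orthant standing in for the trial-vector conditions. The step length λ_j is halved until condition (3.25) holds and the new iterate remains a trial vector.

```python
import math

def W(z):
    x, y = z
    return [x * x + y - 3.0, x + y * y - 5.0]

def J(z):
    x, y = z
    return [[2.0 * x, 1.0], [1.0, 2.0 * y]]

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def is_trial(z):
    return all(t > 0.0 for t in z)  # stand-in for m_i > 0, v+- >= 0

def damped_newton(z, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        w = W(z)
        if norm(w) < tol:
            break
        (a, b), (c, d) = J(z)
        det = a * d - b * c                 # Jacobian must be regular
        h = [(-w[0] * d + w[1] * b) / det,  # Newton step b^j solving (3.27)
             (-w[1] * a + w[0] * c) / det]
        lam = 1.0
        while lam > 1e-8:
            zn = [z[i] + lam * h[i] for i in range(2)]
            if is_trial(zn) and norm(W(zn)) < norm(w):  # condition (3.25)
                z = zn
                break
            lam *= 0.5                      # shrink lambda_j and retry
        else:
            break  # no acceptable step found: treat as divergence
    return z

z = damped_newton([3.0, 3.0])
print(z)  # converges to the root (1, 2) inside the positive orthant
```

If no admissible λ_j is found, the iteration stops with the last trial vector, mirroring the fallback to the grid-point methods described above.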
The x^i are the locations of minima of the function

Σ_{r=1}^n y_r u_r − φ,

and to determine these is an ill-conditioned task in one dimension (see, e.g., Wilkinson [49, p. 39]); this situation cannot be expected to improve when S_k has several dimensions. A first-order a posteriori approximation of the sensitivity can be made if the matrix of (3.27) is regular. Let f(z) = G(z) and let z̄ be an approximation of a solution z*. Linearizing, we arrive at

df = f(z*) − f(z̄) ≈ ⟨∇f, b⟩,  where M b = −W(z̄),

with

M_{ij} = ∂W_i(z̄)/∂z_j,

and ⟨a, b⟩ denotes Σ_r a_r b_r. Hence we find

‖b‖ = ‖M⁻¹ W(z̄)‖ ≤ ‖M⁻¹‖ ‖W(z̄)‖,

and we get the approximation ‖z̄ − z*‖ ≈ ‖b‖. The estimate of df can be written in an attractive manner. Using Lemma 2.1 in Gustafson [22], we get

df ≈ − ⟨u, w⟩,  where M^T u = ∇f and w = W(z̄).

Hence an approximate bound on |df| is given by |⟨u, w⟩|. This concludes our treatment of the computational scheme. Recently (1973) a definition of cluster has been given that is independent of grid size; the convergence theorems are then different from Lemma 3.7.

BIBLIOGRAPHY

[1] Baker, George A., Jr. and John L. Gammel, "Applications of the Principle of the Minimum Maximum Modulus to Generalized Moment Problems and Some Remarks on Quantum Field Theory," J. Math. Anal. and Appl. 33, 197-211 (1971).
[2] Ben-Israel, A., "A Newton-Raphson Method for the Solution of Systems of Equations," J. Math. Anal. and Appl. 15, 243-252 (1966).
[3] Ben-Israel, A., A. Charnes, and K. O. Kortanek, "Asymptotic Duality Over Closed Convex Sets," J. Math. Anal. and Appl. 35 (1971).
[4] Bojanic, R. and R. DeVore, "On Polynomials of Best One-Sided Approximation," L'Enseignement Math. 12, 139-164 (1966).
[5] Buck, R. C., "Alternation Theorems for Functions of Several Variables," J. Approx. Theory 1, 325-334 (1968).
[6] Charnes, A. and W. W. Cooper, Management Models and Industrial Applications of Linear Programming (J. Wiley and Sons, New York, 1961), Vols. I and II.
[7] Charnes, A., W. W.
Cooper, and K. O. Kortanek, "Duality, Haar Programs and Finite Sequence Spaces," Proc. Nat. Acad. Sci. U.S. 48, 783-786 (1962).
[8] Charnes, A., W. W. Cooper, and K. O. Kortanek, "On the Theory of Semi-Infinite Programming and a Generalization of the Kuhn-Tucker Saddle Point Theorem for Arbitrary Convex Functions," NRLQ 16, 41-51 (1969).
[9] Cheney, W. E., Introduction to Approximation Theory (McGraw-Hill, Inc., N.Y., 1966).
[10] Dantzig, G. B., Linear Programming and Extensions (Princeton University Press, N.J., 1963).
[11] DeVore, R., "One Sided Approximations of Functions," J. Approx. Theory 1, 11-25 (1968).
[12] Duffin, R. J., "Infinite Programs," in Linear Inequalities and Related Systems (ed. by H. W. Kuhn and A. W. Tucker), Annals of Math. Studies No. 38, Princeton University Press, Princeton, N.J., pp. 157-170 (1956).
[13] Duffin, R. J., "An Orthogonality Theorem of Dines Related to Moment Problems and Linear Programming," J. Combinatorial Theory 2, 1-26 (1967).
[14] Duffin, R. J., "Duality Inequalities of Mathematics and Science," pp. 401-423 in [39].
[15] Duffin, R. J. and L. A. Karlovitz, "Formulation of Linear Programs in Analysis I: Approximation Theory," SIAM J. Appl. Math. 16, 662-675 (1968).
[16] Fan, Ky, "On Systems of Linear Inequalities," in Linear Inequalities and Related Systems (ed. by H. W. Kuhn and A. W. Tucker), Annals of Math. Studies No. 38, Princeton University Press, Princeton, N.J. (1956), pp. 99-156.
[17] Fan, Ky, "Asymptotic Cones and Duality of Linear Relations," J. Approx. Theory 2, 152-159 (1969).
[18] Gorr, W. and K. O. Kortanek, "Numerical Aspects of Pollution Abatement Problems: Constrained Generalized Moment Techniques," IPP Report No. 12, School of Urban and Public Affairs, Carnegie-Mellon University (Oct. 1970).
[19] Gorr, W., S.-A. Gustafson, and K. O. Kortanek, "Optimal Control Strategies for Air Quality Standards and Regulatory Policy," Environment and Planning 4, 183-192 (1972).
[20] Gustafson, S.-A., "On the Computational Solution of a Class of Generalized Moment Problems," SIAM J. Numer. Analysis 7, 343-357 (1970).
[21] Gustafson, S.-A., "Numerical Aspects of the Moment Problem," Fil.dr. Thesis, Institutionen för Informationsbehandling, Stockholms Universitet, Stockholm, Sweden (Apr. 1970).
[22] Gustafson, S.-A., "Control and Estimation of Computational Errors in the Evaluation of Interpolation Formulae and Quadrature Rules," Math. Computation 24, 847-854 (1970).
[23] Gustafson, S.-A. and Germund Dahlquist, "On the Computation of Slowly Convergent Fourier Integrals," Methoden und Verfahren der Mathematischen Physik 6, 37-43 (1972).
[24] Gustafson, S.-A. and K. O. Kortanek, "Analytical Properties of Some Multiple-Source Urban Diffusion Models," Environment and Planning 4, 31-41 (1972).
[25] Gustafson, S.-A. and K. O. Kortanek, "Mathematical Models for Air Pollution Control: Numerical Determination of Optimizing Abatement Policies," to appear in Models for Environmental Pollution Control (R. A. Deininger, Ed.), Ann Arbor Science Press, Ann Arbor, Mich.
[26] Gustafson, S.-A., K. O. Kortanek, and W. Rom, "Non-Cebysevian Moment Problems," SIAM J. Numer. Analysis 7, 335-342 (1970).
[27] Gustafson, S.-A. and J. Martna, "Numerical Treatment of Size Frequency Distributions with Computer Machine," Geologiska Föreningens Förhandlingar 84, 372-389 (1962).
[28] Gustafson, S.-A. and W. Rom, "Applications of Semi-Infinite Programming to the Computational Solution of Approximation Problems," Tech. Report No. 88, Dept. of Operations Research, Cornell University, Ithaca, N.Y. (Sept. 1969).
[29] Haar, A., "Über lineare Ungleichungen," Acta Math. (Szeged) 2, 1-14 (1924).
[30] John, Fritz, "Extremum Problems with Inequalities as Side Conditions," in Studies and Essays, Courant Anniversary Vol. (ed. K. O. Friedrichs, O. E. Neugebauer, and J. J. Stoker), J. Wiley and Sons, Inc., New York, pp. 187-204 (1948).
[31] Kantorovich, L. V. and G. Sh. Rubinshtein, "Concerning a Functional Space and Some Extremum Problems," Dokl. Akad. Nauk SSSR 115, 1058-1061 (1957).
[32] Karlin, S. and W. J. Studden, Tchebycheff Systems: with Applications in Analysis and Statistics (Interscience Publishers, J. Wiley and Sons, Inc., New York, 1966).
[33] Kelley, J. E., Jr., "The Cutting Plane Method for Solving Convex Programs," J. SIAM 8, 703-712 (1960).
[34] Kortanek, K. O. and J. P. Evans, "Pseudo-Concave Programming and Lagrange Regularity," Operations Research 15, 882-891 (1967).
[35] Krafft, Olaf, "Programming Methods in Statistics and Probability Theory," pp. 425-446 in [39].
[36] Kretschmer, K. S., "Programmes in Paired Spaces," Can. J. Math. 13, 221-238 (1961).
[37] Kretschmer, K. S., "Linear Programming in Locally Convex Spaces and Its Use in Analysis," Ph.D. Thesis, Carnegie-Mellon University, Pittsburgh, Pa. (1958).
[38] Meinardus, Günter, Approximation of Functions: Theory and Numerical Methods (Springer-Verlag, New York, Inc., 1967).
[39] Nonlinear Programming (ed. J. B. Rosen, O. L. Mangasarian, and K. Ritter) (Academic Press, New York, 1970).
[40] Ortega, J. M. and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables (Academic Press, New York and London, 1970).
[41] Powell, M. J. D., "On the Maximum Errors of Polynomial Approximation Defined by Interpolation and by Least Squares Criteria," Computer J. 9, 404-407 (1966).
[42] Rivlin, T. J. and H. S. Shapiro, "A Unified Approach to Certain Problems of Approximation and Minimization," SIAM J. Appl. Math. 9, 670-699 (1961).
[43] Rubinshtein, G. Sh., "Investigations on Dual Extremal Problems," Doctoral Dissertation, Inst. Matem. SO AN SSSR, Novosibirsk (1965).
[44] Shapiro, H. S., "On a Class of Extremal Problems for Polynomials in the Unit Circle," Portugaliae Math. 20, 67-93 (1961).
[45] Shohat, J. A. and J. D. Tamarkin, "The Problem of Moments," Mathematical Surveys No. 1, Am. Math. Soc., New York (1943).
[46] Stiefel, E., "Note on Jordan Elimination, Linear Programming and Tchebycheff Approximation," Numer. Math. 2, 1-17 (1960).
[47] Todd, J., A Survey of Numerical Analysis (McGraw-Hill, New York, 1962).
[48] Vershik, A. M. and V. Temel't, "Some Questions Concerning the Approximation of the Optimal Value of Infinite-Dimensional Problems in Linear Programming," Sibirskii Matematicheskii Zhurnal 9, 591-601 (1968).
[49] Wilkinson, J. H., Rounding Errors in Algebraic Processes (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1963).
[50] Wilkinson, J. H., The Algebraic Eigenvalue Problem (Clarendon Press, Oxford, 1965).
[51] Wolfe, Philip, "Accelerating the Cutting Plane Method for Nonlinear Programming," J. Soc. Indust. Appl. Math. 9, 481-488 (1961).
[52] Bartels, R. H., G. H. Golub, and M. A. Saunders, "Numerical Techniques in Mathematical Programming," in Nonlinear Programming (ed. by J. B. Rosen, O. L. Mangasarian, and K. Ritter), Academic Press, New York, pp. 123-176 (1970).

MIN/MAX BOUNDS FOR DYNAMIC NETWORK FLOWS

W. L. Wilkinson
The George Washington University

ABSTRACT

This paper presents an algorithm for determining the upper and lower bounds for arc flows in a maximal dynamic flow solution. The procedure is basically an extended application of the Ford-Fulkerson dynamic flow algorithm, which also solves the minimal cost flow problem. A simple example is included. The presence of bounded optimal arc flows entertains the notion that one can pick a particular solution which is preferable by secondary criteria.

I. INTRODUCTION

Ford and Fulkerson [1] introduced the notion of maximal dynamic flows in networks and provided an ingenious algorithm for solving the dynamic linear programming problem. A dynamic network consists of arcs and nodes with two nonnegative integers associated with each arc. One of the integers defines the capacity of the arc and the other the time required to traverse the arc.
There are two distinguished nodes in the network, one for the source where all flows originate and one for the sink where all flows terminate. If at each node the commodity can either be transshipped immediately or held over for later shipment, what is the maximal amount of commodity flow from source to sink in a specified number of time periods?

Solutions constructed by the Ford-Fulkerson algorithm have the attractive property of being presented as a relatively small number of activities (chain flows which represent a shipping schedule) which are repeated over and over (temporal repetition) until the end of the allotted time span. A consequence of this temporal repetition is that a single arc flow value in each arc represents an optimal solution, independent of how these arc flows are decomposed into chain flows. In networks of operational interest, these optimal arc flow values frequently have an upper bound different from the lower bound. These bounds say that one can always find an optimal chain flow solution which lies on or within the stated range, and that no optimal solution lies outside these bounds. The procedure set forth in the sequel calculates these boundary values for each arc.

As shown in [2], the Ford-Fulkerson dynamic flow algorithm also solves the minimal cost flow problem. In this problem, roughly described, we are given a network having one or more sources and one or more sinks, with availabilities at the sources and requirements at the sinks. There are intermediate nodes between the sources and sinks, with connecting arcs having assigned capacities and unit shipping costs. The problem is to construct a feasible flow, if one exists, which minimizes cost in satisfying the requirements within the given availabilities. Similarly to the dynamic flow problem, the bounds on optimal arc flows will indicate the variety of ways, if any, in which such a feasible flow can be constructed.
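For readers who want to experiment, the dynamic flow question posed above can also be answered by brute force on the time-expanded network: replicate each node once per period, add unlimited-capacity holdover arcs, and run an ordinary static maximum-flow computation. The sketch below is ours, not the paper's; the Ford-Fulkerson procedure used in this paper works instead with temporally repeated chain flows and is far more economical. Here plain breadth-first augmentation (Edmonds-Karp style) is used:

```python
from collections import deque

def max_dynamic_flow(arcs, s, t, P):
    """Maximal dynamic flow via the time-expanded network: node (x, tau);
    each arc (x, y) with capacity c and traversal time a becomes
    ((x, tau) -> (y, tau + a)); holdover arcs (x, tau) -> (x, tau + 1)
    have unlimited capacity.  Augment along BFS paths until none remain."""
    INF = float("inf")
    cap = {}
    def add(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)                 # residual (reverse) arc
    nodes = {x for (x, y, c, a) in arcs} | {y for (x, y, c, a) in arcs}
    for x in nodes:
        for tau in range(P):
            add((x, tau), (x, tau + 1), INF)      # holdover at node x
    for (x, y, c, a) in arcs:
        for tau in range(P - a + 1):
            add((x, tau), (y, tau + a), c)        # traversal arc
    src, snk, flow = (s, 0), (t, P), 0
    while True:
        prev = {src: None}
        q = deque([src])
        while q and snk not in prev:              # breadth-first search
            u = q.popleft()
            for (a_, b_) in cap:
                if a_ == u and cap[(a_, b_)] > 0 and b_ not in prev:
                    prev[b_] = u
                    q.append(b_)
        if snk not in prev:
            return flow                           # no augmenting path left
        path, v = [], snk
        while prev[v] is not None:
            path.append((prev[v], v))
            v = prev[v]
        push = min(cap[e] for e in path)          # bottleneck capacity
        for (u, v) in path:
            cap[(u, v)] -= push
            cap[(v, u)] += push
        flow += push

arcs = [("s", "a", 1, 1), ("a", "t", 1, 1),   # chain s-a-t repeats 3 times
        ("s", "b", 1, 2), ("b", "t", 1, 1)]   # chain s-b-t repeats twice
print(max_dynamic_flow(arcs, "s", "t", 4))    # -> 5
```

On this four-arc example the chain s-a-t (travel time 2) can depart in periods 0, 1, 2 and the chain s-b-t (travel time 3) in periods 0 and 1, so the maximal dynamic flow for P = 4 has value 5; a temporally repeated solution attains the same value with far fewer variables.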
Before describing the computing procedure for bounded flows, we will give a more formal statement of the dynamic flow problem referred to above.

II. MAXIMAL DYNAMIC FLOWS

Given the network G = [N; A] with source s and sink t in the node set N, we let nonnegative integers c(x, y) and a(x, y) be the capacity and traversal time, respectively, of each arc (x, y) in the arc set A. Let f(x, y; τ) be the amount of flow that leaves x along (x, y) at time τ, consequently arriving at y at time τ + a(x, y). Also, f(x, x; τ) is the holdover at x from τ to τ + 1. If V(P) is the net flow leaving s (or entering t) during the P periods 0 to 1, 1 to 2, ..., P−1 to P, then the problem may be stated as the linear program:

Maximize V(P),

subject to

Σ_{τ=0}^P Σ_y [ f(s, y; τ) − f(y, s; τ − a(y, s)) ] − V(P) = 0,

Σ_y [ f(x, y; τ) − f(y, x; τ − a(y, x)) ] = 0,  x ≠ s, t;  τ = 0, 1, ..., P,

0 ≤ f(x, y; τ) ≤ c(x, y).

Here a(x, x) = 1 and c(x, x) = ∞ for holdovers at node x. If f(x, y; τ) and V(P) satisfy the above constraints, we call f a dynamic flow from s to t (for P periods) and say that f has value V(P). If also V(P) is maximal, then f is a maximal dynamic flow.

III. BOUNDED FLOW ALGORITHM

To initiate the bounded arc flow computations, a maximal dynamic flow solution is required. The Ford-Fulkerson algorithm is used to obtain such a solution. We set forth their algorithm here, using the notation of [2], as Routine I, in the interests of a coherent presentation for the convenience of the reader. Routine II is an application of Kirchhoff's first general law on the conservation of flow at any node in a network. Routine II calculates "slack bounds": slack in the sense that flow values contained by these bounds may not all be optimal; however, no optimal arc flow values are excluded by the bounds. Routine II is very efficient in calculating bounds based on local information at a node and is retained for that reason. Routine III tightens up these bounds to their true value where necessary.
Routine III is, essentially, an application of Routine I (the Ford-Fulkerson algorithm) to a subnetwork composed of a special set of admissible arcs from the original network. Using the original flow solution and the flow boundaries from Routine II, optimal flows are circulated through the admissible arcs about a selected arc being scanned. Treating the initial and terminal nodes of the arc being scanned as a temporary source and sink, optimal flows are maximized and minimized in the arc, thereby determining the absolute upper and lower optimal arc flow boundaries. If at any time the varying optimal flow values in the admissible arcs are observed to reach a bound of Routine II, that bound has been verified, since a known solution lies on a bound which is suspect of being too loose. Consequently, several unscanned arcs may get scanned in the process of scanning a particular arc. The reader may note that Routine III, slightly revised, could compute the true bounds without the aid of Routine II. Experience has shown that retaining the services of Routine II saves a substantial amount of computation time.

ROUTINE I: Ford-Fulkerson Algorithm

Initial Conditions
1. Establish P, the time span of interest.
2. Set node numbers π(x) = 0 for all x.
3. Define ā(x, y) = π(x) + a(x, y) − π(y) and consider as an admissible arc any (x, y) where ā(x, y) = 0.
4. Set all f(x, y) = 0.
5. During the routine a node is in one of the following states: unlabeled and unscanned, labeled but unscanned, or labeled and scanned.
6. All nodes are unlabeled.

Arc Flow Generating Routine

STEP 1. To node s assign the label [+s, ∞] and consider node s as unscanned.

STEP 2. Take any labeled, unscanned node x and suppose that it is labeled [±w, h]. To all nodes y that are unlabeled and such that:
a. (x, y) is admissible and f(x, y) < c(x, y), assign the label [+x, min(h, c(x, y) − f(x, y))], or if
b. (y, x) is admissible and f(y, x) > 0, assign the label [−x, min(h, f(y, x))].
Consider node x as scanned and any newly labeled y-nodes as unscanned. Repeat until:
a. node t is labeled (breakthrough), or
b. no new labels are possible and node t is unlabeled (non-breakthrough).

STEP 3. If breakthrough results and node t is labeled [+y, h], replace f(y, t) by f(y, t) + h. Next turn attention to node y. In general, if y is labeled [+x, m], replace f(x, y) by f(x, y) + h, or if y is labeled [−x, m], replace f(y, x) by f(y, x) − h. Next turn attention to node x. Ultimately the node s is reached; at this point stop the replacement process.

Starting with the new flows thus generated, discard the old labels and repeat the above Arc Flow Generating Routine until no new labels are possible and node t cannot be labeled. When this condition results, proceed with the following Non-Breakthrough Processing Routine.

Non-Breakthrough Processing

STEP 1. Calculate a value of δ as follows. Define

A₁ = { (x, y) | x ∈ X, y ∈ X̄, ā(x, y) > 0 },
A₂ = { (x, y) | x ∈ X̄, y ∈ X, ā(x, y) < 0 },

where X is the subset of labeled nodes and X̄ is the complementary subset of unlabeled nodes. Let

δ₁ = min of ā(x, y) over (x, y) ∈ A₁, or P + 1 − π(t) if A₁ = ∅,
δ₂ = min of −ā(x, y) over (x, y) ∈ A₂, or P + 1 − π(t) if A₂ = ∅.

Then δ = min(δ₁, δ₂). Now define for all x new node numbers π'(x) as

π'(x) = π(x), if x is labeled;
π'(x) = min[π(x) + δ; π(x) + P + 1 − π(t)], if x is unlabeled.

After new node numbers have been assigned, consider π'(x) as π(x).

STEP 2. If π(t) < P + 1, return to the Arc Flow Generating Routine. If π(t) = P + 1, the algorithm terminates.

ROUTINE II: Slack Arc Flow Bounds

Initial Conditions
1. A maximal dynamic flow solution has been obtained for a particular P. Retain ā(x, y) and f(x, y) for all (x, y). Retain π(x) for all x.
2. Set for all (x, y):

G(x, y)/g(x, y) = 0/0, if ā(x, y) > 0;
G(x, y)/g(x, y) = c(x, y)/0, if ā(x, y) = 0;
G(x, y)/g(x, y) = c(x, y)/c(x, y), if ā(x, y) < 0.

3. Add an arc (t, s) and set G(t, s) = g(t, s) = Σ_y f(s, y).
4.
Order all nodes in increasing π(x) sequence, with no preference where equality exists.
5. All nodes are unscanned.
6. If ā(x, y) = 0, then (x, y) is an admissible arc.

Procedure

STEP 1. Take the lowest ordered, unscanned node x and for all admissible arcs calculate and insert into the arc record

(1a)  G'(x, y) = min[ G(x, y); Σ_i G(i, x) − Σ_j g(x, j) + g(x, y) ],
(1b)  g'(x, y) = max[ g(x, y); Σ_i g(i, x) − Σ_j G(x, j) + G(x, y) ].

If G'(x, y) < G(x, y) or g'(x, y) > g(x, y), consider y as unscanned.* Now consider the newly assigned G'(x, y) and g'(x, y) as G(x, y) and g(x, y), respectively. When all admissible arcs have been examined, consider x as scanned, proceed to the next lowest ordered unscanned node and scan that node. Scan all nodes.

STEP 2. When all nodes have been scanned, consider all nodes as unscanned. Take the highest ordered, unscanned node y and for all admissible arcs calculate and insert in the arc record

(2a)  G'(x, y) = min[ G(x, y); Σ_j G(y, j) − Σ_i g(i, y) + g(x, y) ],
(2b)  g'(x, y) = max[ g(x, y); Σ_j g(y, j) − Σ_i G(i, y) + G(x, y) ].

If G'(x, y) < G(x, y) or g'(x, y) > g(x, y), consider x as unscanned. Now consider the newly assigned G'(x, y) and g'(x, y) as G(x, y) and g(x, y), respectively. When all admissible arcs have been examined, consider y as scanned, proceed to the next highest ordered unscanned node and scan that node. Scan all nodes.

STEP 3. When all nodes have been scanned, consider all nodes as unscanned. Take the lowest ordered, unscanned node x and for all admissible arcs calculate and insert in the arc record the results of Equations (1a) and (1b). Consider x as scanned. If G'(x, y) < G(x, y) or g'(x, y) > g(x, y), consider y as unscanned. Proceed to the next lowest ordered, unscanned node and scan that node. Scan all nodes. This terminates Routine II. Go to Routine III.

NOTE: As described above, Routine II sweeps from source to sink, sink to source, and then source to sink.
Computational experience has indicated that three sweeps achieve the best compromise between best bounds and reasonable computing times. One could specify repetitive sweeps until a complete sweep had been made with no changes to G(x, y)/g(x, y). Alternatively, premature termination is allowed at any point, since we are only seeking approximations to the true G(x, y)/g(x, y).

*The scan state of y may be either scanned or unscanned; if neither condition is met, this state remains unchanged. This potential redundancy is necessary to accommodate the d(x, y) = 0 instances where π(x) = π(y), so that both arcs, (x, y) and (y, x), may be admissible. The stated inequalities prevent looping.

ROUTINE III
Taut Arc Flow Bounds

Initial Conditions

1. Label all arcs as follows:
a. "Gg" if f(x, y) = G(x, y) = g(x, y),
b. "G+" if f(x, y) = G(x, y) ≠ g(x, y),
c. "+g" if f(x, y) = g(x, y) ≠ G(x, y), or
d. "++" if none of the above is true.
2. Mark all arcs labeled "Gg" and consider them scanned.
3. An unmarked arc is an admissible arc.
4. All nodes are unlabeled and unscanned.
5. If all arcs are scanned, terminate the routine. Otherwise, go to the procedure below.

Procedure

STEP 1. Take any unscanned arc (x, y) and consider x as s′ and y as t′.
a. If (s′, t′) is labeled "+g", go to STEP 2 below and omit STEP 3 below.
b. If (s′, t′) is labeled "G+", go to STEP 3 below.
c. If (s′, t′) is labeled "++", go to STEPS 2 and 3 below.

STEP 2. To node t′ assign the label [+s′, G(s′, t′) − f(s′, t′)]. Take a labeled unscanned node x (initially t′ is the only such node) and suppose it is labeled [±w, h]; to all nodes y that are unlabeled and for which
a. (x, y) is admissible and f(x, y) < G(x, y), assign the label [+x, min(h, G(x, y) − f(x, y))], or
b. (y, x) is admissible and f(y, x) > g(y, x), assign the label [−x, min(h, f(y, x) − g(y, x))].
Consider node x as scanned.
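The initial arc classification of Routine III reduces to a pair of comparisons; a minimal sketch:

```python
def initial_arc_label(f, G, g):
    """Classify an arc for Routine III from its flow f and bounds G, g.

    "Gg" -- flow pinned at both bounds (arc is taut and immediately scanned),
    "G+" -- pinned at the upper bound only,
    "+g" -- pinned at the lower bound only,
    "++" -- pinned at neither bound.
    """
    if f == G and f == g:
        return "Gg"
    if f == G:
        return "G+"
    if f == g:
        return "+g"
    return "++"
```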
Repeat until node s′ is labeled (breakthrough) or no new labels are possible and node s′ is unlabeled (non-breakthrough). If breakthrough results and node s′ is labeled [+y, h], replace f(y, s′) by f(y, s′) + h, or if node s′ is labeled [−y, h], replace f(s′, y) by f(s′, y) − h. In either case, if the arc (y, s′) or (s′, y) was previously considered scanned, it remains scanned, bearing the label "Gg". If not previously scanned, consider the following cases.

a. Node s′ is labeled [+y, h]. If the new f(y, s′) = G(y, s′) and the current label is "+g", relabel the arc "Gg" and consider it scanned; or if the current label is "++", relabel the arc "G+" and consider it unscanned. If the new f(y, s′) < G(y, s′), the arc retains its current label and remains unscanned.

b. Node s′ is labeled [−y, h]. If the new f(s′, y) = g(s′, y) and the current label is "G+", relabel the arc "Gg" and consider it scanned; or if the current label is "++", relabel the arc "+g" and consider it unscanned. If the new f(s′, y) > g(s′, y), the arc retains its current label and remains unscanned.

Next, turn attention to node y and repeat the replacement and labeling process as for (y, s′) or (s′, y), incrementing or decrementing the flow value by h and determining whether the arc is to be considered scanned or unscanned. Continue this replacement process until the reverse path to node s′ has been traced out. At this point, stop the replacement process.

If f(s′, t′) = G(s′, t′), a condition for non-breakthrough exists. If not, then, starting with the new flows thus generated, discard the old node labels, consider all nodes as unscanned, and repeat this Procedure until non-breakthrough results. When non-breakthrough results, record G(s′, t′) = f(s′, t′). Erase all node labels, consider all nodes as unscanned, and proceed to STEP 3 below if 1c is satisfied. Otherwise, go to STEP 4.

STEP 3.
Do STEP 2 except assign the initial label to node s′ as [−t′, f(s′, t′) − g(s′, t′)], stop the labeling and replacement breakthrough process at t′, and, on non-breakthrough, record g(s′, t′) = f(s′, t′). Then go to STEP 4 below.

STEP 4. Consider (s′, t′) as scanned and reverted to its original (x, y) designation. Erase all node labels and consider all nodes as unscanned. Take the next unscanned arc (x, y) and repeat all of the above Procedure. If no such arc exists, terminate the routine. G(x, y) and g(x, y) are the firm upper and lower bounds, respectively, for optimal arc flows.

IV. PROOF OF EXTREME ARC FLOW VALUES

We stated earlier that the Bounded Flow Routine II has proven useful in reducing computing times. Since Routine III operates on the slack bounds produced by Routine II, it is necessary to show that optimal arc flow values are never excluded by Routine II. We will now prepare the way for stating such a theorem.

LEMMA 1: The sequence of equation pairs for G′(x, y) and g′(x, y) in the Bounded Flow Routine II will produce monotonic nonincreasing values of G′(x, y) and monotonic nondecreasing values of g′(x, y) for all (x, y).

PROOF: The truth of the assertion is a natural consequence of the structure of Equations (1) and (2) for G′(x, y) and g′(x, y), where the upper limit for G′(x, y) is G(x, y) and the lower bound for g′(x, y) is g(x, y). Independent of the number of iterations of the equation pairs, the previously calculated value for G(x, y) or g(x, y) provides a bound consistent with the assertion above.

LEMMA 2: If G(x, y) ≥ g(x, y) for all (x, y), Equations (1) and (2) will maintain G′(x, y) ≥ g′(x, y) for all (x, y).

PROOF: Consider the pair of Equations (1) for G′(x, y) and g′(x, y). The cases of interest are where

G′(x, y) = Σ_i G(i, x) − Σ_j g(x, j) + g(x, y) < G(x, y),

and

g′(x, y) = Σ_i g(i, x) − Σ_j G(x, j) + G(x, y) > g(x, y).
Ignoring the inequalities and subtracting the second equation from the first, we have

G′(x, y) − g′(x, y) = Σ_i [G(i, x) − g(i, x)] + Σ_j [G(x, j) − g(x, j)].

By hypothesis, each of the two summation terms on the right side is nonnegative; therefore, the difference on the left side is nonnegative. Equations (2) are symmetric with (1) for calculations progressing from sink to source, and a similar exercise to the above would produce equivalent results. That the hypothesis is true at the outset can be seen in the Initial Conditions established by Condition No. 2. Recall that we insisted that c(x, y) ≥ 0.

LEMMA 3: When scanning a node x, the equations of Routine II always calculate Σ_j G(x, j) ≥ Σ_i g(i, x).

PROOF: Consider Equation (1b), which is

g′(x, y) = max[g(x, y); Σ_i g(i, x) − Σ_j G(x, j) + G(x, y)],

so that

g′(x, y) − G(x, y) ≥ Σ_i g(i, x) − Σ_j G(x, j).

We restate the above inequality by changing signs. Now,

G(x, y) − g′(x, y) ≤ Σ_j G(x, j) − Σ_i g(i, x).

Lemma 2 states that G′(x, y) ≥ g′(x, y), and Lemma 1 assures us that G(x, y) ≥ G′(x, y). Therefore, the left side is nonnegative. Consequently,

Σ_j G(x, j) ≥ Σ_i g(i, x).

Analogous arguments to those above for Lemma 3 would prove its corollaries for the following:

Σ_i G(i, x) ≥ Σ_j g(x, j), by considering Equation (1a),
Σ_j G(y, j) ≥ Σ_i g(i, y), by considering Equation (2a), and
Σ_i G(i, y) ≥ Σ_j g(y, j), by considering Equation (2b).

THEOREM 1: Let G(x, y) and g(x, y) be the upper and lower bounds, respectively, as determined by the Bounded Flow Routine II for all arcs (x, y) in G = [N; A]. Then any temporally repeated maximal dynamic flow solution for this network will contain for all (x, y) a flow f(x, y) such that g(x, y) ≤ f(x, y) ≤ G(x, y).

PROOF: Required flows are initially set in the inadmissible arcs according to the optimality criteria, i.e., for those arcs where d(x, y) < 0 the minimum flow is equal to the arc capacity, and for those arcs where d(x, y) > 0 the maximum flow is equal to zero.
Where d(x, y) = 0, the flow may be anywhere between zero and the arc capacity. In the return arc from sink to source, we have set the minimum and maximum equal to the maximal static flow for the system for the time period of interest. These values do not change at any time. Initially, for the admissible arcs, the maximum flow is set at the arc capacity and the minimum flow at zero, which are the broadest possible bounds since we insist that 0 ≤ f(x, y) ≤ c(x, y) for all (x, y). Clearly then, our theorem holds for the initial conditions.

Lemma 1 tells us that G(x, y) never increases, but tends to decrease, and g(x, y) never decreases, but tends to increase, while Lemma 2 maintains that G(x, y) ≥ g(x, y) for all (x, y) at all times. Lemma 3 and its corollaries insure that the available flow into a node is at least equal to the required flow out of a node and, conversely, that the required flow into a node is no greater than the available flow out of a node, for all nodes at all times.

Since initially the feasibility of meeting local conditions of optimality exists everywhere, we will proceed by induction on the sequence of nodes to be scanned, i.e., s, x₁, x₂, . . ., t, where π(xᵢ) ≤ π(xᵢ₊₁). Where equality holds between two or more nodes and the connecting arcs have zero traversal times, there may be redundant scanning of any of the nodes, but Lemmas 1 through 3 will hold nonetheless for each scan.

For the initial scanning of s, we know that G(s, x) and g(s, x) are all valid in the sense of the theorem. Therefore the cases of interest are where G′(s, x) < G(s, x) and g′(s, x) > g(s, x). Accordingly,

G′(s, x) = G(t, s) − Σ_{j≠x} g(s, j),

or

G′(s, x) + Σ_{j≠x} g(s, j) = G(t, s).

Suppose we then substitute f(s, x) > G′(s, x) for G′(s, x). Then

f(s, x) + Σ_{j≠x} g(s, j) > G(t, s),

a contradiction to conservation of flow at s and the relative flow conditions maintained by Lemma 3. Similarly,

g′(s, x) + Σ_{j≠x} G(s, j) = g(t, s),

and we substitute f(s, x) < g′(s, x) for g′(s, x).
Then

f(s, x) + Σ_{j≠x} G(s, j) < g(t, s),

again a contradiction as above. Thus we see that G′(s, x) and g′(s, x) are valid upper and lower bounds, respectively, for all x, whether or not they have been changed from their initial values.

Next consider x₁. If Σ_i G(i, x₁) or Σ_i g(i, x₁) are no longer their original values, we know from the above that they are valid. Again, the equation of interest is

G′(x₁, y) + Σ_{j≠y} g(x₁, j) = Σ_i G(i, x₁),

and a substitution f(x₁, y) > G′(x₁, y) constitutes a contradiction. Similarly,

g′(x₁, y) + Σ_{j≠y} G(x₁, j) = Σ_i g(i, x₁).

A substitution f(x₁, y) < g′(x₁, y) produces a contradiction, and G′(x₁, y) and g′(x₁, y) are validated.

As we proceed, we see that this holds for any xⱼ, for the calculations are based on Σ_i G(i, xⱼ) and Σ_i g(i, xⱼ) which, although possibly changed from their initial values, are known to be valid in the sense of the theorem.

At the pivot node t, we reverse our sequence and proceed to s. Lemmas 2 and 3 insure that ultraconservative conditions hold at this crucial point, and a parallel argument to the s to t sequence would produce equivalent results. One may make as many iterative passes from s to t and t to s as desired without violating the conditions asserted by the theorem. This concludes our proof.

We now turn our attention to the Bounded Flow Routine III. The scanning process of Routine III is basically an application of the Ford-Fulkerson (F-F) algorithm (Routine I) as it operates between nonbreakthroughs. The maintenance of conservation of flow at every node and the maximizing properties of this algorithm are well established [2]. Our proof for Routine III then reduces to showing that there exists a formal equivalence between our application and the standard conditions for the F-F algorithm. Consider STEP 2, which is concerned with determining the best G(x, y). G(x, y) has replaced c(x, y) as the upper bound and g(x, y) has replaced zero as the lower bound.
Justification for these substitutions can be found in Theorem 1. Our new network has all the arcs and nodes of the old except the arc being scanned. As in the F-F algorithm, only those arcs where d(x, y) = 0 are admissible for labeling purposes, because under the optimality criteria it is only in these arcs that the flow can be altered. We start with a feasible flow, as does the F-F algorithm. The source for this network is t′ and the sink is s′. The source gets labeled with the maximum amount of flow that can be augmented in (s′, t′), i.e., G(x, y) − f(x, y) by Theorem 1. In the F-F algorithm, this is taken to be ∞, for it is not known a priori what the maximum amount of flow augmentation is. The labeling rules are the same, as are the rules for breakthrough and nonbreakthrough. There is a distinction in the replacement process, where the flow is incremented or decremented in a sequence of arcs which go from sink to sink. However, the last arc in this sequence is the arc being scanned, which is not a part of the current network. When the routine reaches nonbreakthrough, further flow augmentation is impossible and we have the maximum flow in the arc being scanned.

The argument for STEP 3 follows that for STEP 2 above. The source for this network is s′ and the sink is t′. The source gets a negative label with the maximum amount the flow in (s′, t′) can be reduced, i.e., f(s′, t′) − g(s′, t′) by Theorem 1. On nonbreakthrough, the flow in (s′, t′) will have been decremented to its absolute minimum with respect to optimality, and we record the final g(x, y) = f(x, y). We can now state the following theorem.

THEOREM 2: Let G(x, y) and g(x, y) be the upper and lower bounds, respectively, as determined by the Bounded Flow Routine III for all arcs (x, y) in G = [N; A]. Then the integers n such that g(x, y) ≤ n ≤ G(x, y) provide an exhaustive set of valid arc flows for which there exists an integer, temporally repeated, maximal dynamic flow solution for G[N; A].

V.
AN EXAMPLE

In Figure 1 is shown a simple network and a dynamic flow solution for the stabilization time of P = 15. Following stabilization time, the static flow of 10 is repeated each time period, and new solutions are not necessary since the arc flows do not change in value. The small lower numbers in the nodes are the node names and the larger upper numbers are the node numbers π(x). The data in the arc boxes are the following: upper left, capacity; upper right, transit time; and the lower number is the arc flow f(x, y). Capacities and transit times are symmetric, e.g., c(x, y) = c(y, x).

Figure 1.

In Figure 2 is shown the bounded arc flow solution based on the flow solution in Figure 1. The network data are the same as in Figure 1 except that the lower numbers in the arc boxes are the upper/lower bounds G(x, y)/g(x, y).

Figure 2.

In decomposing all alternative routes for their maximum optimal flow, we get the following set of nine routes.

Possible chain    Time length    P = 15 use    Max flow
0-2-5-6               14             2            6
0-1-4-6               13             3            4
0-1-4-5-6             15             1            4
0-2-1-4-6             12             4            4
0-1-3-4-6             13             3            4
0-2-1-4-5-6           14             2            5
0-1-3-4-5-6           15             1            4
0-2-1-3-4-6           12             4            4
0-2-1-3-4-5-6         14             2            5

These alternative routes offer a fair variety of different ways of scheduling a particular optimal solution. For example, we list below two different solutions for contrast. Here, again, the time span is 15.

Solution A
Chain        Flow    Use    Dynamic flow
0-1-4-6       4       3          12
0-2-5-6       6       2          12
                                 24

Solution B
Chain        Flow    Use    Dynamic flow
0-1-4-5-6     4       1           4
0-2-1-4-6     4       4          16
0-2-5-6       2       2           4
                                 24

Consider, for instance, that Nodes 1 and 2 are origins and Nodes 4 and 5 are destinations. Then, if there were some preference, not formally stated, for maximizing the origin-destination deliveries 1-4 and 2-5, one would choose Solution A. However, if the preferred pairings were 1-5 and 2-4, Solution B is the best.

VI.
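The table arithmetic can be checked mechanically: a chain of time length τ is usable P + 1 − τ times within the horizon, and each solution's dynamic flow is the sum of static flow × use over its chains. A small sketch, with the chain data transcribed from the solutions above:

```python
P = 15
# (chain, time length, static flow) for the two contrasting solutions
solution_a = [("0-1-4-6", 13, 4), ("0-2-5-6", 14, 6)]
solution_b = [("0-1-4-5-6", 15, 4), ("0-2-1-4-6", 12, 4), ("0-2-5-6", 14, 2)]

def dynamic_flow(solution, P):
    # a chain of time length tau can be repeated P + 1 - tau times
    return sum(flow * (P + 1 - tau) for _, tau, flow in solution)
```

Both solutions deliver the same total dynamic flow of 24, which is why the choice between them can be driven by unstated origin-destination preferences.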
EXPERIENCE

Some version of the Bounded Flow Algorithm has been in use at The George Washington University since late 1967. The algorithm and the associated computer codes have been revised several times with the objective of increasing their efficiency. Currently, the bounded arc flow computation takes approximately one minute on a 500-arc network. The program is written in PL/1 for an IBM 360/50.

VII. ACKNOWLEDGEMENTS

The research was conducted as part of the Program in Logistics of the Institute for Management Science and Engineering, The George Washington University. The work was supported by the Office of Naval Research.

Special recognition is due to Donald J. Hunt of the Program in Logistics who, since the very beginning of this development, has made material contributions to the power and efficiency of the computational procedures. He is solely responsible for the large gains achieved in decreased running time which followed the original implementation of the algorithm. Thanks are also due to Raymond W. Lewis of the Program in Logistics for his valuable observations and suggestions during the various levels of algorithmic development.

REFERENCES

[1] Ford, L. R., Jr., and D. R. Fulkerson, "Constructing Maximal Dynamic Flows from Static Flows," Operations Research 6, 419-433 (1958).
[2] Ford, L. R., Jr., and D. R. Fulkerson, Flows in Networks (Princeton University Press, 1962).

PRODUCTION-ALLOCATION SCHEDULING AND CAPACITY EXPANSION USING NETWORK FLOWS UNDER UNCERTAINTY

Juan Prawda
Tulane University
New Orleans, Louisiana

ABSTRACT

This paper extends Connors and Zangwill's work in network flows under uncertainty to the convex cost case. The extended network flow under uncertainty algorithm is applied to compute N-period production and delivery schedules of a single commodity in a two-echelon production-inventory system with convex costs and low demand items.
Given an initial production capacity for N periods, the optimal production and delivery schedules for the entire N periods are characterized by the flows through paths of minimal expected discounted cost in the network. As a by-product of this algorithm, the multi-period stochastic version of the parametric budget problem for the two-echelon production-inventory system is solved.

1. INTRODUCTION

In a recent paper, Connors and Zangwill [7] developed Network Flows Under Uncertainty (NFUU), or r-networks, by allowing the requirements or availabilities at the nodes of a network to be discrete random variables with known probability distributions. They extended the standard deterministic multistage network flow problem introduced by Ford and Fulkerson [11]. The underlying structure of network flow problems was exploited in Ref. [7] to produce both a new structure which is not a deterministic network but maintains many of its properties, and a new node which replicates flows instead of conserving them. They called the former r-networks and the latter r-nodes. Construction of an NFUU from a given N-period stochastic problem is not given in this paper; the reader is referred to [7].

Given the N-period problem and convex objective criteria, we will develop an algorithm to solve the network flow problem that minimizes expected cost. Two applications of this algorithm are given: (1) to compute optimal N-period production and delivery schedules of a single commodity in a two-echelon production-inventory system with convex costs and low demand items; and (2) to solve the parametric-budgetary problem in the multiperiod stochastic case, corresponding to the system described in (1).

This paper is organized as follows. In section 2 the Convex Network Flow Under Uncertainty Algorithm is stated, and its validity and convergence are proven; in section 3 the N-period, two-echelon production and delivery inventory problem is stated.
In section 4 we extend the parametric budgetary problem to the multiperiod, stochastic case for the system considered in section 3.

2. THE CONVEX NETWORK FLOW UNDER UNCERTAINTY ALGORITHM

Let G = (N, A) denote a Network Flow Under Uncertainty (NFUU), where N is a finite collection of elements x, y, . . . and A is a finite subset of ordered pairs (x, y) of elements taken from N. N is supposed to be of the form N = N₁ ∪ N₂ ∪ N₃ with Nᵢ ∩ Nⱼ = ∅ for i, j = 1, 2, 3, i ≠ j. The elements of N₁ are called nodes, the elements r₁, r₂, . . . of N₂ are called replication nodes or r-nodes, and the elements c₁, c₂, . . . of N₃ are called collating nodes or c-nodes. Members of A are referred to as arcs. All arcs will be supposed to be of the form (x, y) with x ≠ y, x, y in N. We exclude arcs (x, y) where both x and y are in Nᵢ, i = 2, 3, and arcs going from r-nodes to c-nodes and vice versa.

If x is in N, we let a(x) ("after x") denote the set of all y in N for which (x, y) is in A, that is, a(x) = {y ∈ N | (x, y) ∈ A}. Similarly, we let b(x) ("before x") denote the set of all y in N for which (y, x) is in A, that is, b(x) = {y ∈ N | (y, x) ∈ A}.

Given G, each arc (x, y) in A has associated with it a nonnegative real number q(x, y), called the capacity of the arc (x, y); a nonnegative integer f(x, y), called the flow of the arc (x, y); and a nonnegative real number g(x, y), called the expected discounted cost of (x, y). Both f and g are functions from A to the nonnegative reals, the former having nonnegative integers as its range. Let s, called the source, and t, called the sink, be two distinguished elements of N.

Each rₖ in N₂ has a single input arc and several output arcs and possesses the following two properties:

(1) f(x, rₖ) = f(rₖ, y) for all y in a(rₖ) and some x in b(rₖ), and
(2) g(x, rₖ) ≥ 0, g(rₖ, y) = 0 for all y in a(rₖ) and some x in b(rₖ).
Property (1) merely states that the flow on each of the output arcs of an r-node must be identical with that on the input arc, and (2) states that all the outgoing arcs of an r-node have an expected discounted cost of zero. Each cₖ in N₃ is essentially the negative of an r-node; that is, it has several input arcs and a single output arc and possesses the following two properties:

(3) f(y, cₖ) = f(cₖ, x) for all y in b(cₖ) and some x in a(cₖ), and
(4) g(cₖ, x) ≥ 0, g(y, cₖ) = 0 for all y in b(cₖ) and some x in a(cₖ).

Properties (3) and (4) are, respectively, the negative of (1) and (2). It is shown in [7] that

(5) G = ∪_{i=1}^{M} Gⁱ ∪ N₂ ∪ N₃,

where Gⁱ = (N₁ⁱ, Aⁱ), i = 1, . . ., M, M is a finite integer, and Gⁱ ∩ Gʲ = ∅ for i, j = 1, . . ., M, i ≠ j. Each Gⁱ, i = 1, . . ., M, is called a subnetwork, and it is an ordinary network consisting of ordinary nodes xⁱ, yⁱ, . . . in N₁ⁱ and ordinary arcs (xⁱ, yⁱ) in Aⁱ where xⁱ, yⁱ are in N₁ⁱ. In each subnetwork Gⁱ the total inflow equals the total outflow, and the flow is conserved at each node in N₁ⁱ (i = 1, . . ., M).

Two subnetworks, say Gⁱ and Gʲ, i, j = 1, . . ., M, i ≠ j, are connected if there exists at least one rₖ in N₂ or cₖ in N₃ for which (xⁱ, rₖ) and (rₖ, yʲ) are in A with xⁱ ∈ N₁ⁱ and yʲ ∈ N₁ʲ, or (xⁱ, cₖ) and (cₖ, yʲ) are in A with xⁱ ∈ N₁ⁱ and yʲ ∈ N₁ʲ. Let

N₂ⁱ = {rₖ ∈ N₂ | (rₖ, xⁱ) or (xⁱ, rₖ) ∈ A, xⁱ ∈ N₁ⁱ},
N₃ⁱ = {cₖ ∈ N₃ | (cₖ, xⁱ) or (xⁱ, cₖ) ∈ A, xⁱ ∈ N₁ⁱ}

be the sets of r- and c-nodes that connect subnetwork i (i = 1, . . ., M) with the rest of the NFUU. From the NFUU G = (N, A) of Figure 1 we observe that N₁ = {1, 2, . . ., 17, 18}, N₂ = {r₁, r₂, r₃}, N₃ = {c₁, c₂, c₃}, and M = 8.

Figure 1. A network flow under uncertainty, or r-network.

The algorithm to follow is based on the works of Connors and Zangwill [7] and Hu [14]. It is very closely related to the works of Beale [2], Busacker and Gowen [4], Hu [15], and Zangwill [24].
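Properties (1) and (3) are simple replication/collation conditions on arc flows; a minimal sketch, with flows passed as plain lists (the data layout is an assumption of this illustration):

```python
def satisfies_r_property(inflow, outflows):
    """Property (1): every output arc of an r-node replicates the input flow."""
    return all(out == inflow for out in outflows)

def satisfies_c_property(inflows, outflow):
    """Property (3): every input arc of a c-node carries the output flow."""
    return all(inn == outflow for inn in inflows)
```

An r-node thus duplicates one unit of flow onto every subnetwork it feeds, while a c-node merges identical flows back together, which is exactly why flow is conserved only inside the subnetworks, not at these nodes.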
This algorithm, utilizing the decomposition (5), iterates by determining shortest routes, or routes of minimal expected discounted cost, along the subnetworks (which are ordinary networks), forcing one unit of flow on this route and making appropriate adjustments for the r- and c-nodes.

Let h[f(x, y)] be a nonnegative convex function of f(x, y) for all (x, y) in A, such that h(0) = 0, where the flow function f(x, y) is required to have nonnegative integers as its range. The cost function for the entire network is

Σ over all (x, y) ∈ A of h[f(x, y)].

This cost function is a sum of convex functions and thus convex. Let

(6) h̄[f(x, y)] = h[f(x, y) + 1] − h[f(x, y)], for f(x, y) ≥ 0 and all (x, y) in A,

and

(7) h̲[f(x, y)] = h[f(x, y) − 1] − h[f(x, y)], for f(x, y) > 0 and all (x, y) in A.

Expression (6) defines the up-cost of an arc and (7) the down-cost of an arc. It is shown in [14] that h̄( ) > 0 and h̲( ) < 0, that h̄(a) < h̄(b) for a < b, and that |h̲(a)| < |h̲(b)| for a < b.

A particular flow called a path-flow is a flow with f(s, x) = f(x, y) = . . . = f(z, t) = 1 and f(u, w) = 0 for all u, w ≠ s, x, y, . . ., z, t. If the cost of a flow with value v is known and we superimpose a path flow on this given flow, the resulting flow has value v + 1. The up-cost h̄ is used if the arc flow of the path flow is of the same direction as that of the arc flow of the flow with value v, and the down-cost h̲ is used if the two flows are of opposite directions. The sum of the h̄ and h̲ values used in the path flow is called the incremental cost of the path flow.

An iteration of the convex NFUU algorithm first requires construction of a modified r-network from the current flow in the original r-network. A shortest route algorithm is then applied to determine the shortest route from source to sink in the modified r-network, using the up-cost and down-cost (h̄( ) and h̲( ), respectively) of an arc as its length. One unit of flow is then forced from source to sink in the original network along a route corresponding to the shortest path obtained in the modified r-network.
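The up-cost (6) and down-cost (7) are one-line finite differences of the convex arc cost; a minimal sketch, where the quadratic cost used in the check is an illustrative assumption, not from the paper:

```python
def up_cost(h, f):
    # h-bar in (6): incremental cost of pushing one more unit through the arc
    return h(f + 1) - h(f)

def down_cost(h, f):
    # h-underbar in (7): cost change from removing one unit (requires f > 0)
    return h(f - 1) - h(f)
```

For a convex h with h(0) = 0, the up-cost is positive and nondecreasing in f, while the down-cost is negative with nondecreasing magnitude, which is the property quoted from [14].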
The up-cost and down-cost of all the arcs in the modified r-network are redefined based on the new flow pattern of the original network. This cycle is repeated until the amount of flow at t in the original network is I.*

*The definition of I is given in the next sentence.

Given G (the original network) and I (a positive integer corresponding to an input flow to the network), the precise algorithmic statements for the convex NFUU algorithm are:

STEP 0 (Initialization): Set f(x, y) = 0 for all (x, y) in A.

STEP 1 (Network Modification): Given the current flow in the original network, f(x, y) for all (x, y) in A, define a modified r-network as follows:
a) If 0 ≤ f(x, y) < q(x, y), leave the arc (x, y) in the modified r-network with cost h̄[f(x, y)] as defined in (6).
b) If f(x, y) = q(x, y), delete the arc (x, y) from the modified r-network.
c) If 0 < f(x, y) and
   i) x, y are in N₁, add a reverse arc (y, x) in the modified r-network with cost h̲[f(x, y)] as defined in (7);
   ii) x is in N₁ and y is in N₂, add reverse arcs (y, x) and (z, y) in the modified r-network for all z in a(y), the former with cost h̲[f(x, y)] as defined in (7) and the latter with cost zero;
   iii) x is in N₃ and y is in N₁, add reverse arcs (y, x) and (x, z) in the modified r-network for all z in b(x), the former with cost h̲[f(x, y)] as defined in (7) and the latter with cost zero.
Both b) and c) must be done if f(x, y) = q(x, y), for all (x, y) in A.

STEP 2 (Shortest Route): Determine the shortest route, or routes of minimal expected discounted cost, from s to t in the modified r-network. Use properties (1) and (2) for r-nodes and (3) and (4) for c-nodes. Apply any shortest route algorithm [9, 11] with h̄( ) and h̲( ) as the lengths of the arcs of the modified r-network.
STEP 3 (Flow Augmentation): Send one unit of flow from s to t in the original network along the route corresponding to the shortest path just obtained in STEP 2, that is, along the path whose incremental cost in the modified r-network, relative to the existing flow in the original network, is minimum.

STEP 4 (Iteration and Stopping Rule): If the amount at t in the original network is I, stop. Otherwise, return to STEP 1 with the current flow. If, during the application of STEP 2, no shortest path exists from s to t in the modified r-network, the original problem is infeasible.

The validity and convergence of the algorithm are proven in [7] for the linear case. The next theorem proves the validity and convergence of the algorithm for the convex case. Let j = 1 stand for the source s and j = m for the sink t. We will next prove that the convex NFUU algorithm is equivalent to computing a flow vector f = (fᵢ), i = 1, . . ., k (k is the total number of elements in A), and a vector b = (bⱼ), j = 2, . . ., m − 1, whose components are the amounts of flow at the nodes (with b₁ = I, bₘ = −I), which solve

(8) Min h(f)
    subject to Df = b,
               If ≤ q,
               f ≥ 0,

where Df = b merely states that: a) the total inflow and total outflow of every node in N₁ must equal the amount of flow at the node; b) every incoming flow (outgoing flow) of an r-node (c-node) equals the amount of flow at the node; c) every replication* (collating*) flow of an r-node (c-node) equals the amount of flow at the node. Here q = (qᵢ), i = 1, . . ., k, is the vector of arc capacities, I is a k × k identity matrix, and h(f) = Σ_{i=1}^{k} hᵢ(fᵢ), where each hᵢ(fᵢ) is a real-valued convex cost function of fᵢ, i = 1, . . ., k.

*The flow on each of the outgoing (incoming) arcs of an r-node (c-node) is called replication (collating) flow.

THEOREM 1: Assume that at the end of iteration s (s > 0) of the convex NFUU algorithm, fˢ is a feasible solution to the convex programming problem

(9) Min h(f)
We let /° = and f s <I where / is the input flow to the NFUU. For this f s suppose /* is optimal to (10) Min/»(/) subject to Df=v 10 if arc i is not in the path flow 1 if arc i is in the path flow, ^here for i=l A and 10 if node j is not in the path flow 1 if node j is in the path flow, for j — 1, . . ., m. Then f s+1 — f s + f s is optimal for (11) Min h(f) subject to Df = b' + V /2*0. Pf: For /;(/) linear, the proof is in [7]. Assume h{f) convex with h(0) = 0. First we will prove /* +1 is feasible for (11). Adding the first m constraints of (9) and (10) yields NETWORK FLOWS UNDER UNCERTAINTY 523 D(f* + f°) = b' + r, or Df s+l = b' + -q. /* +1 is always bounded below by zero, since /* 5 s from (9) and/* 2= — /* from (10). The last 2k con- straints of (10) yield adding /■ 5= to the above inequality -//« + //• =£ //* + If* =£ q - /* + /*. The above inequality yields *s //* +1 ^ g , and thus/* +1 is feasible to (11). Next, we prove the optimality of/* +1 . Let the cost associated with problem (9) be h(f s ) and let h{f*) be the optimal incremental cost of problem (10) corresponding to the optimal path flow / s . It follows from the optimality of/* that h{f*)^h(f s ) for any path flow/* feasible to problem (10). Since the cost of the new flow/* +l in the original network equals the cost of the old existing flow /* in the original network plus the incremental cost of the path flow/*, in the modified r-network, it follows that Hf' +i ) = h(f'+p) = h(f') + h(f*)^h(f') + h(f») = h(f*+f*)=h(f*) for any feasible/* to problem (11). Then/* +1 is optimal to (11). Q.E.D. The last theorem proves that in terms of the NFUU, at the end of each iteration, the flow will be optimal for the amount thus far placed into the source of the NFUU. The convergency of the algorithm follows from the fact that/* +l is bounded above by a finite integer /and at each iteration / s+1 increases by a flow of value one. 
Next we suggest the application of the above convex NFUU algorithm to the solution of two problems, one given in section 3 and the other in section 4.

3. AN N-PERIOD, 2-ECHELON PRODUCTION-INVENTORY SYSTEM

Interest in multiechelon inventory systems has been spurred by the existence of large military logistics networks and private industry. A number of papers on single and multiproduct, multiinstallation inventory models have been published. A comprehensive review of these topics can be found in the excellent published bibliographies of Iglehart [17], [18], Scarf, Gilford, and Shelley [20], and Veinott [22].

Several approaches have been used to compute optimal N-period reordering points in the preceding multiechelon inventory systems. For instance, Bessler and Veinott [3], using the assumption that stock left over (backlogged) at the end of the N periods in each facility can be salvaged (purchased) at the same stationary unit price, decompose an N-variable linear cost function into the sum of N one-variable linear functions, and a stationary policy given by a critical vector is shown to be optimal.

Relaxing Bessler and Veinott's [3] assumption, the N-period problem will in general not decompose into N one-period problems, and dynamic programming is used to compute the optimal policy. Others, such as Clark and Scarf [5, 6], have used dynamic programming. However, its use has been shown to be computationally infeasible for even simpler problems than the one considered by Bessler and Veinott (see [17, 18]).

The objective of this section is to suggest the use of the previous convex NFUU algorithm to solve N-period, multiechelon production and delivery inventory systems.
It is the structure of the NFUU networks that allows for some computational improvement, relative to other techniques used to solve similar systems such as dynamic programming, in obtaining optimal production and delivery schedules of N-period, multiechelon stochastic inventory problems with low item demands.

This paper is concerned with the problem of scheduling the production $x_{01}, x_{02}, \ldots, x_{0N}$ and allocation $x_{11}, \ldots, x_{1N}, x_{21}, \ldots, x_{2N}, \ldots, x_{n1}, \ldots, x_{nN}$ of a single product in facilities $1, \ldots, n$ in successive time periods $1, 2, \ldots, N$ so as to minimize the total expected discounted costs over the $N$ periods. The requirements of each facility are discrete random variables, each of which has a known probability mass function. Figure 2 illustrates a two-echelon system consisting of a plant, a warehouse 0, and $n$ facilities numbered $1, 2, \ldots, n$. Although we are interested in more general multiechelon systems, the preceding one will suffice to illustrate our approach. Some remarks concerning the generalization to more complex multiechelon systems are given at the end of this section.

Figure 2. [Diagram: a plant with known production capacity for the $N$ periods feeds warehouse 0, which supplies facilities $1, \ldots, n$, each facing uncertain requirements.]

At the beginning of period one we consider a known production capacity for the $N$ periods. Let $I$ denote the production capacity. The production capacity could be present in this problem because of restrictions on the raw material needed to produce the given single commodity. Uncertain requirements in facility $i$ ($i = 1, \ldots, n$) in each period are satisfied insofar as possible from stock on hand at the beginning of the period in that facility and from the allocation and production of the commodity at the beginning of the period. Requirements which cannot be met in a given period (because, for example, of limited production) are backlogged until they can be satisfied by subsequent production or allocation in future periods.
NETWORK FLOWS UNDER UNCERTAINTY 525 Let au (i — 1, . . ., n; t—1, . . ., N) be a parameter associated with each facility i in any period t. this parameter will take the following values: a«=l,2, . . ., n it fort'=l, . . ., n and t=l, . . .,N. Let {Du, i=l, . . ., n; t=l, . . . , N} be a family of discrete, nonnegative random variables. For a fixed i and t, Du takes on values in {d\, d-i, . . ., d^,}, a set of nonnegative real numbers. Let P^=P{Z) tt =<*„„} for alia,,. The sequence {Pa U } of real numbers will be a probability distribution of D it with P^ ^ 0, and "it This distribution is assumed to be known. Du is not necessarily independent* between any two facili- ties or identically distributed for successive periods. Let to,, A (an, a,2, . . ., a«), i=l, . . ., n and 1 =£ t «S N, be the index associated with the random variables defined below. This notation identi- fies the sequence of realizations in facility i up to period t. Thus for example wo A (2, 1, 5) denotes the sequence of following events in facility i: In period 1 realization 2, in period 2 realization 1, and realization 5 in period 3. Let (l)0t A (a>n, G>21> • • ., 0)nl, . . ., 0)u, <02t, • ■ ■ , <Ont) ■ We let D^ "" for i = 1 , . . . , n and t = 1 , . . . , TV be the vector of realizations caused by the stochastic requirements in facility i up to period t. It is conditioned on all previous realizations in that facility in previous time periods, that is£M" n) , . . •»#J. w ji'~ 1) and occurs with conditional probability P(a>if), where P(<o il )=P{D it = d a . l \D i , t - l = da i>t _ v . . .,Dn=d ail }. Let ^°' ) (3= 0) be the production completed in the plant (warehouse) at the beginning of period t given the sequence of random events &>o< and *j"" ) (3 s 0) (i= 1, . . . , n), the allocation completed in facility i at the beginning of period t {t=\, . . ., N) given the sequence of random events (oit. 
Let $I_{0t}^{(\omega_{0t})}$ ($t = 1, \ldots, N$) be the inventory at the end of period $t$ in the warehouse given the sequence $\omega_{0t}$, and $I_{it}^{(\omega_{it})}$ ($i = 1, \ldots, n$; $t = 1, \ldots, N$) be the inventory at the end of period $t$ in each facility $i$ given the sequence $\omega_{it}$. We will assume that the initial inventory at all facilities and the warehouse is zero at the beginning of period one. In order to simplify the statement of the problem, we will provisionally suppress the index $\omega_{it}$ and merely refer to the random variables $x_{it}$ and $I_{it}$, $i = 0, 1, \ldots, n$, $t = 1, \ldots, N$.

*If for a fixed $i$, $D_{it}$ ($t = 1, \ldots, N$) is a sequence of dependent random variables, then the marginal distribution $P_{a_{ik}}$ ($1 \le k \le N$) can be obtained from the given joint probability distribution.

We will assume that the lead time in production and delivery to the $n$ facilities is zero. The inventory level equation becomes

$I_{it} = \sum_{h=1}^{t} \left( x_{0h} - \sum_{k=1}^{n} x_{kh} \right)$ for $i = 0$,

$I_{it} = \sum_{h=1}^{t} (x_{ih} - D_{ih})$ for $i = 1, \ldots, n$,

where $t = 1, \ldots, N$. Let $\beta_i$ be a nonnegative integer denoting the number of periods of backlog permitted for facility $i$; thus

$I_{it} \ge -\sum_{k=t-\beta_i+1}^{t} D_{ik}$ for all $i = 1, \ldots, n$.

Note that, in general, we would have $\sum_{t=1}^{N} x_{0t} \le I$, which implies that it may be optimal not to use all the production capacity $I$ through the $N$ periods.* Let $k_{it}$ be the known capacity of the production line (for $i = 0$) and of the transportation facilities (for $i = 1, \ldots, n$) during period $t$ ($t = 1, \ldots, N$). Let $Q_{it}$ be the known storage capacity for the warehouse ($i = 0$) and the $n$ facilities ($i = 1, \ldots, n$) during period $t$ ($t = 1, \ldots, N$). Thus we will require that

$x_{it} \le k_{it}$ for $i = 0, \ldots, n$ and $t = 1, \ldots, N$,

and

$I_{it} \le Q_{it}$ for $i = 0, \ldots, n$ and $t = 1, \ldots, N$.

Let

$z = \left( I_{01}^{(\omega_{01})}, \ldots, I_{0N}^{(\omega_{0N})}, I_{11}^{(\omega_{11})}, \ldots, I_{1N}^{(\omega_{1N})}, \ldots, I_{n1}^{(\omega_{n1})}, \ldots, I_{nN}^{(\omega_{nN})}, x_{01}^{(\omega_{01})}, \ldots, x_{0N}^{(\omega_{0N})}, x_{11}^{(\omega_{11})}, \ldots, x_{nN}^{(\omega_{nN})} \right)$

be the schedule vector for the entire system, given the sequence of realizations in all facilities $i$, $i = 1, \ldots, n$, up to period $N$.
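The inventory level equations are simple running sums of production minus allocation (warehouse) and allocation minus realized demand (facilities). A small sketch, with invented data for $n = 2$ facilities and $N = 3$ periods and the zero-lead-time, zero-initial-inventory assumptions of the text:

```python
# Invented data: n = 2 facilities, N = 3 periods.
x0 = [6, 4, 3]                    # production completed at the plant, x_{0t}
x  = [[2, 3, 1], [3, 1, 2]]       # allocations x_{it} to facilities i = 1, 2
D  = [[1, 4, 1], [2, 2, 1]]       # realized requirements D_{it}

N, n = len(x0), len(x)

# Warehouse inventory: I_{0t} = sum_{h<=t} (x_{0h} - sum_k x_{kh}).
I0 = [sum(x0[h] - sum(x[k][h] for k in range(n)) for h in range(t + 1))
      for t in range(N)]

# Facility inventories: I_{it} = sum_{h<=t} (x_{ih} - D_{ih});
# a negative value would be backlog, bounded below by the beta_i constraint.
I = [[sum(x[i][h] - D[i][h] for h in range(t + 1)) for t in range(N)]
     for i in range(n)]
```

In facility 1 the period-2 demand exceeds the allocation, and the running sum absorbs it against the stock carried from period 1, exactly as the "satisfied insofar as possible from stock on hand" convention requires.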
We have the following costs in each period: production, shipping, holding, and shortage (the last due to backlogged demand). The preceding costs are assumed to be convex functions of the quantities produced and delivered at the beginning of the period and of the quantities stored or backlogged at the end of the period, respectively. The cost functions for successive periods need not be the same.

*In terms of the NFUU this can be accomplished by arcs representing unused production capacity in the plant at each period $t$, $t = 1, 2, \ldots, N$.

Let $\delta_t$, $0 \le \delta_t \le 1$, be the discount factor for period $t$. Let $\gamma_1 = 1$ and $\gamma_t = \prod_{j=1}^{t-1} \delta_j$ for $t > 1$. The total expected discounted cost $F(z)$ is defined to be the sum of the following expected discounted costs:

i) Total expected discounted production and holding cost in the warehouse:

$\sum_{t=1}^{N} \gamma_t \left[ C_t(x_{0t}) + H_{0t}(I_{0t}) \right]$.

ii) Total expected discounted transportation, holding, and penalty cost in all facilities:

$\sum_{i=1}^{n} \sum_{t=1}^{N} \gamma_t \left[ T_{it}(x_{it}) + H_{it}(I_{it}) \right]$,

where $H_{it} = \max \{ h_{it}(I_{it}),\ p_{it}(I_{it}) \}$, and $h_{it}(\cdot)$, $p_{it}(\cdot)$ are convex functions of their arguments satisfying

$h_{it}(I_{it}) - p_{it}(I_{it}) \begin{cases} > 0 & \text{if } I_{it} > 0 \\ = 0 & \text{if } I_{it} = 0 \\ < 0 & \text{if } I_{it} < 0. \end{cases}$

Here $h_{it}(\cdot)$ and $p_{it}(\cdot)$ are, respectively, the expected holding and penalty costs,* for $i = 0, \ldots, n$ and $t = 1, \ldots, N$. $C_t(\cdot)$, $h_{it}(\cdot)$, $p_{it}(\cdot)$, and $T_{it}(\cdot)$ are convex functions of their respective arguments with $C_t(0) = h_{it}(0) = p_{it}(0) = T_{it}(0) = 0$ for $i = 0, \ldots, n$ and $t = 1, \ldots, N$.

Then the problem can be stated as: given the uncertain market requirements for each of the $n$ facilities over the next $N$ periods and the production capacity for the $N$ periods, find a production-allocation schedule $z$, called optimal, which minimizes the total expected discounted cost

$F(z) = \sum_{t=1}^{N} \gamma_t \left[ C_t(x_{0t}) + H_{0t}(I_{0t}) + \sum_{i=1}^{n} \{ T_{it}(x_{it}) + H_{it}(I_{it}) \} \right]$

subject to

$\sum_{t=1}^{N} x_{0t} \le I$,

$0 \le x_{it} \le k_{it}$ for $i = 0, \ldots, n$ and $t = 1, \ldots, N$,

$I_{it} \le Q_{it}$ for $i = 0, \ldots, n$ and $t = 1, \ldots, N$,

where

$I_{it} = \sum_{h=1}^{t} \left( x_{0h} - \sum_{k=1}^{n} x_{kh} \right)$ for $i = 0$, $\quad I_{it} = \sum_{h=1}^{t} (x_{ih} - D_{ih})$ for $i = 1, \ldots, n$,

and

$I_{it} \ge -\sum_{k=t-\beta_i+1}^{t} D_{ik}$.

*$H_{it}(I_{it})$ may not be convex for $I_{it} = 0$. However, in terms of the network flow approach there will be arcs with expected cost $h_{it}(\cdot)$ corresponding to storage of inventory in facility $i$ at the end of period $t$, and different arcs with expected cost $p_{it}(\cdot)$ corresponding to backlogged inventory in facility $i$ at the end of period $t$. Thus $H_{it}(I_{it})$ is never used explicitly in the NFUU.

In order to solve the above problem we suggest that the above two-echelon, multiperiod, stochastic production-delivery problem be rewritten in terms of the NFUU (see [7]), and that the convex NFUU algorithm described in section 2 then be used to compute the optimal production and delivery schedules. Since the NFUU decomposes into a set of subproblems which are small network flow problems, this algorithm seems more attractive than dynamic programming, especially for multiechelon inventory systems with low-demand items (0-1 demands) and a small number of time periods. It was shown in Ref. [7] that the amount of computer storage required by the NFUU algorithm is proportional only to $n$, the dimension of the one-stage problem. The approach suggested in this paper can be applied to more general multiechelon systems than the one depicted in Figure 2 (e.g., where transshipments between facilities are allowed), with a consequent increase in the number of arcs and nodes in the NFUU.

4. A MULTIPERIOD, STOCHASTIC VERSION OF THE PARAMETRIC BUDGET PROBLEM

Suppose that a fixed budget of $v$ dollars can be allocated among the production-line, transportation, and storage facilities of the existing production-delivery inventory system for the purpose of increasing the production capacity $I$ for the $N$ periods. The cost of increasing the capacity of the production line ($i = 0$) and transportation facilities ($i = 1, \ldots, n$) during period $t$ ($t = 1, \ldots, N$) is $v_{it}$ dollars per unit increase. The cost of increasing storage capacity at the warehouse ($i = 0$) and the $n$ facilities ($i = 1, \ldots, n$) during period $t$ ($t = 1, \ldots, N$) is $v_{s_{it}}$ dollars per unit increase. Let $w_{it}$ and $w_{s_{it}}$ be decision variables corresponding, respectively, to the amount of additional capacity to be built in the production-line ($i = 0$) and transportation ($i = 1, \ldots, n$) facilities, and in the storage facilities, during period $t$ ($t = 1, \ldots, N$), dependent upon the sequence $\omega_{it}$ of random events in facility $i$ up to period $t$. Then the problem of increasing the production capacity for the $N$ periods to, say, $I'$ ($I' > I$) is to minimize the following expected expansion cost:

$\min \sum_{i=0}^{n} \sum_{t=1}^{N} \left( v_{it} \cdot P(\omega_{it}) \cdot w_{it}^{(\omega_{it})} + v_{s_{it}} \cdot P(\omega_{it}) \cdot w_{s_{it}}^{(\omega_{it})} \right)$

subject to

$\sum_{t=1}^{N} x_{0t}^{(\omega_{0t})} \le I'$,

$0 \le x_{it}^{(\omega_{it})} \le k_{it} + w_{it}^{(\omega_{it})}$ for $i = 0, 1, \ldots, n$ and $t = 1, \ldots, N$,

$I_{it}^{(\omega_{it})} \le Q_{it} + w_{s_{it}}^{(\omega_{it})}$ for $i = 0, \ldots, n$ and $t = 1, \ldots, N$,

where

$I_{it}^{(\omega_{it})} = \sum_{h=1}^{t} \left( x_{0h}^{(\omega_{0h})} - \sum_{k=1}^{n} x_{kh}^{(\omega_{kh})} \right)$ for $i = 0$, $\quad I_{it}^{(\omega_{it})} = \sum_{h=1}^{t} \left( x_{ih}^{(\omega_{ih})} - D_{ih}^{(\omega_{ih})} \right)$ for $i = 1, \ldots, n$,

and

$I_{it}^{(\omega_{it})} \ge -\sum_{k=t-\beta_i+1}^{t} D_{ik}^{(\omega_{ik})}$.

A problem related to the one just given would be to maximize the production capacity $I'$ (now a decision variable) for the $N$ periods with a fixed budget of $v$ dollars. This problem is stated as

$\max I'$

subject to

$\sum_{i=0}^{n} \sum_{t=1}^{N} \left( v_{it} \cdot P(\omega_{it}) \cdot w_{it}^{(\omega_{it})} + v_{s_{it}} \cdot P(\omega_{it}) \cdot w_{s_{it}}^{(\omega_{it})} \right) = v$

and the capacity, storage, inventory-balance, and backlog constraints of the preceding problem.

In terms of the NFUU $G = [N; A]$ the preceding two problems are seen to be an extension of the deterministic parametric budget problem, solved by Fulkerson [13] and Hu [14, 16], to the multiperiod, stochastic case. The algorithm for solving the preceding two problems now follows:

STEP 0 (Initialization): Set $f(x, y) = 0$ for all $(x, y)$ in $A$.
STEP 1 (Network Modification): Given $f(x, y)$ for all $(x, y)$ in $A$, define a modified NFUU as follows:

a) if $f(x, y) < q(x, y)$ then $h[f(x, y)] = 0$;
b) if $f(x, y) \ge q(x, y)$ then $h[f(x, y)] = g(x, y)$;
c) if $0 < f(x, y) \le q(x, y)$ then $\bar h[f(x, y)] = 0$;
d) if $f(x, y) > q(x, y)$ then $\bar h[f(x, y)] = -g(x, y)$;

where $g(x, y)$ is the capacity expansion unit cost for arc $(x, y)$ in $A$ and $q(x, y)$ is the original capacity of arc $(x, y)$ in $A$. Obviously properties (1) and (2) of $r$-nodes and (3) and (4) of $c$-nodes must hold.

STEP 2 (Shortest Route): Send one unit of flow from $s$ to $t$ in the original network along a route corresponding to the shortest path just calculated in the modified network, that is, the path in the modified network whose incremental cost is minimum. Apply any shortest-route algorithm [9, 11] with $h(\cdot)$ and $\bar h(\cdot)$ as lengths.

STEP 3 (Flow Augmentation and Stopping Rule): If the amount of flow at $t$ is $I'$ in the original network, or the total amount of money used up is $v$, stop; otherwise return to Step 1 with the current flow.

It is obvious that*

$w_{it}$ or $w_{s_{it}} = f(x, y) - q(x, y)$ if $f(x, y) > q(x, y)$ for some $(x, y)$ in $A$,

and

$w_{it}$ or $w_{s_{it}} = 0$ if $f(x, y) \le q(x, y)$ for some $(x, y)$ in $A$.

*We assume that the function mapping the subscripts $(i, t)$, $i = 0, \ldots, n$, $t = 1, \ldots, N$, to the arcs $(x, y)$ in $A$ is known from the structure of the NFUU.

ACKNOWLEDGMENTS

My sincere thanks to Professors Gordon P. Wright and Larry R. Arnold for their helpful commentaries.

BIBLIOGRAPHY

[1] Arrow, K., S. Karlin, and H. Scarf (eds.), Studies in the Mathematical Theory of Inventory and Production (Stanford University Press, Stanford, Calif., 1958).
[2] Beale, E. M. L., "An Algorithm for Solving the Transportation Problem when the Shipping Cost over each Route is Convex," Nav. Res. Log. Quart. 6, 43-56 (1959).
[3] Bessler, S. A. and A. F. Veinott, Jr., "Optimal Policy for a Dynamic Multi-Echelon Inventory Model," Nav. Res. Log. Quart. 13, 335-389 (1966).
[4] Busacker, R. G. and P. J. Gowen, "A Procedure for Determining a Family of Minimal Cost Network Flow Patterns," Technical Rept. No. 15, Operations Research Office, Johns Hopkins University, Baltimore, Md. (1961).
[5] Clark, A. and H. Scarf, "Optimal Policies for a Multi-Echelon Inventory Problem," Management Science 6, 475-490 (1960).
[6] Clark, A. and H. Scarf, "Approximate Solutions to a Simple Multi-Echelon Inventory Problem," Chapter 5 in Studies in Applied Probability and Management Science, Arrow, Karlin, and Scarf (eds.) (Stanford University Press, Stanford, Calif., 1962).
[7] Connors, M. and W. Zangwill, "Cost Minimization in Networks with Discrete Stochastic Requirements," Operations Research 19, 794-821 (1971).
[8] Dantzig, G., Linear Programming and Extensions (Princeton University Press, Princeton, N.J., 1963).
[9] Dreyfus, S. E., "An Appraisal of Some Shortest-Path Algorithms," Operations Research 17, 395-412 (1969).
[10] El-Agizy, M., "Dynamic Inventory Models and Stochastic Programming," IBM Journal of Research and Development, 351-356 (1969).
[11] Ford, L. R. and D. R. Fulkerson, Flows in Networks (Princeton University Press, Princeton, N.J., 1963).
[12] Ford, L. R. and D. R. Fulkerson, "Constructing Maximal Dynamic Flows from Static Flows," Operations Research 6, 419-433 (1958).
[13] Fulkerson, D. R., "Increasing the Capacity of a Network: The Parametric Budget Problem," Management Science 5, 472-483 (1959).
[14] Hu, T. C., "Minimum Convex Cost Flows," Nav. Res. Log. Quart. 13, 1-19 (1966).
[15] Hu, T. C., "Recent Advances in Network Flows," SIAM Review 10, 354-359 (1968).
[16] Hu, T. C., Integer Programming and Network Flows (Addison-Wesley Publishing Co., Reading, Mass., 1969).
[17] Iglehart, D., "Recent Results in Inventory Theory," J. Indust. Eng. 18, 48-51 (1967).
[18] Iglehart, D., "Recent Developments in Stochastic Inventory Models," invited paper at the National Meeting of ORSA, Denver, Colorado, June 19, 1969.
[19] Prawda, J. and G. P.
Wright, "On Some Applications of Network Flows Under Uncertainty," Proceedings of the International IEEE Conference on Systems, Networks, and Computers, Oaxtepec, Morelos, Mexico (Jan. 19-21, 1971). [20] Scarf, H. E., D. Gilford, M. Shelley, Multistage Inventory Models and Techniques (Stanford Uni- versity Press, Stanford, Calif., 1963). [21] Veinott, A., Jr., "Optimal Policy for Multiproduct, Dynamic, Nonstationary Inventory Problem," Management Science 12, 206-222 (1965). [22] Veinott, A., Jr., "The Status of Mathematical Inventory Theory," Management Science 12, 745-777 (1966). [23] Zangwill, W., "A Deterministic Multiproduct, Multi-facility, Production and Inventory Model," Operations Research 14, 486-507 (1966). [24] Zangwill, W., "The Shortest Route Problem under Either Concave or Convex Costs," Presented at the 12th Annual Operations Research Society of America Meeting, Santa Monica, California (1966). [25] Zangwill, W., Nonlinear Programming, A Unified Approach (Prentice Hall, Inc., Englewood Cliffs, N.J., 1969). CONCAVE MINIMIZATION OVER A CONVEX POLYHEDRON Hamdy A. Taha University of Arkansas ABSTRACT A general algorithm is developed for minimizing a well defined concave function over a convex polyhedron. The algorithm is basically a branch and bound technique which utilizes a special cutting plane procedure to' identify the global minimum extreme point of the convex polyhedron. The indicated cutting plane method is based on Glover's general theory for constructing legitimate cuts to identify certain points in a given convex poly- hedron. It is shown that the crux of the algorithm is the development of a linear underesti- mator for the constrained concave objective function. Applications of the algorithm to the fixed-charge problem, the separable concave programming problem, the quadratic problem, and the 0~1 mixed integer problem are discussed. Computer results for the fixed-charge problem are also presented. I. 
INTRODUCTION Consider the problem (1) min/U), where x= (*i, x 2 , . . ., x n ) and Q— {xeE n \ Ax= b, x^O}. The function of j\x) is assumed to be concave and well defined over the convex polyhedron Q. It is also assumed that the contrained minimum of f(x) is finite. The optimum solution to (1) is characterized by its occurence at an extreme point of Q. However, the principal difficulty is that a local minimum is not necessarily global. A method for solving (1) was proposed by Hoang Tuy [12], but with the additional requirement that f(x) be concave over all xeE n . Tuy's algorithm is started by identifying a local minimum point, x, of Q. A hyperplane cut (called Tuy's cut) is then determined and augmented to the problem so that all feasible (extreme) points in Q having a worse value than f(x) are excluded. Informally, Tuy's cut is generally defined by a hyperplane passing through the end points of the extended halflines emanating from the current local minimum such that the associated values of f(x) at these end points is equal to f{x). It is clear that/(jc) is an upper bound on the optimal objective value and that any extreme point xeQ having f{x) 3= f{x) cannot be promising. The process is then continued by searching for a local minimum of the new solution space resulting from the application of the last Tuy's cut. If no new local minima exist, the algorithm is terminated with the last local minimum being the global optimum. Tuy provides no convergence proof for the algorithm. This paper presents a new algorithm for solving (1). The algorithm is basically a branch-and-bound method which utilizes a special cutting plane procedure to identify the global extreme point of Q. The main difference between this work and Tuy's is that the cuts are generated solely from the geom- 533 534 H. A. TAHA etry of the convex polyhedron Q. Also, the identification of the candidate extreme points of Q necessi- tates defining a linear function which underestimates f{x). 
The linearity restriction is important since, as will be seen later, it reduces the problem to solving a series of linear programs. Although a method is given for developing a linear underestimator for the general case, illustrations for developing more efficient (or tighter) underestimators are also provided for important concave minimization problems.

In section II, the generalized branch-and-bound algorithm is presented and its relationship to work by other authors is discussed. Section III introduces the cutting plane method associated with the algorithm. Section IV develops a general linear underestimator for $f(x)$ and shows how "tighter" underestimators are developed for the fixed-charge problem, the separable programming problem, and the 0-1 mixed integer linear problem. Finally, section V illustrates the computational efficiency of the proposed algorithm as applied to the fixed-charge problem.

II. THE ALGORITHM

The general idea of the algorithm is explained as follows. Let $l(x)$ be a linear underestimator of $f(x)$ over $Q$; that is,

(2) $l(x) \le f(x)$, $x \in Q$;

then it is clear that

(2') $\min_x \{ l(x) \mid x \in Q \} \le \min_x \{ f(x) \mid x \in Q \}$.

This means that, starting with the extreme point $x^0$ satisfying $\min \{ l(x) \mid x \in Q \}$, $\underline f = l(x^0)$ is a lower bound on the optimum objective value of (1), while, from (2), an obvious upper bound is given by $\bar f = f(x^0)$. Now consider $x^1\ (\ne x^0)$, an adjacent extreme point to $x^0$ such that $x^1$ yields the smallest $l(x)$ among all the adjacent extreme points of $x^0$. It is clear that only those adjacent points having $\underline f \le l(x) \le \bar f$ need be considered in determining $x^1$. The point $x^1$ is then said to be the next ranked extreme point. (The exact details of the general (cutting plane) procedure for determining the next ranked extreme points will be presented in section III.) Now, the new lower bound is $\underline f = l(x^1)$. The upper bound $\bar f$ is changed to $f(x^1)$ only if $f(x^1)$ is smaller than the current upper bound $\bar f = f(x^0)$.

In general, suppose $E^{i-1} = \{x^0, x^1, \ldots, x^{i-1}\}$ is the set of (nonredundant) extreme points thus far ranked. Then $x^i$, the next ranked extreme point, is selected as the adjacent extreme point to one of the elements in $E^{i-1}$ such that $l(x^i)$ is again the smallest among all such adjacent extreme points and provided $x^i \notin E^{i-1}$, that is, $x^i$ is nonredundant with respect to $E^{i-1}$. The current lower bound is now given by $\underline f = l(x^i)$, but the upper bound is changed to $f(x^i)$ only if this quantity is smaller than the best available upper bound $\bar f$.

The termination of the procedure is effected at $x^k$ if $\underline f = l(x^k) \ge \bar f$, with the extreme point associated with $\bar f$ being the optimum. This follows since, from (2),

$f(x) \ge l(x^k) \ge \bar f$, $x \in Q^* - E^k$,

where $Q^*$ is the set of extreme points of $Q$. This condition shows that all the remaining extreme points ($Q^* - E^k$) can only yield worse objective values than $\bar f$, and are thus nonpromising.

The above discussion can be summarized in algorithmic form as follows:

STEP 0: Solve the linear program $\min_x \{ l(x) \mid x \in Q \}$ and let $x^0$ be the optimum extreme point. Define $\underline f(x^0) = l(x^0)$ as the lower bound on the optimum objective value of (1). Let $x^* = x^0$; then $f(x^*)$ is an upper bound. Set $i = 0$ and go to Step 1.

STEP 1: The current upper and lower bounds are given by $f(x^*)$ and $\underline f(x^i)$, respectively. Let $x^{i+1}$ be the next ranked extreme point of $Q$ and set the new lower bound $\underline f(x^{i+1}) = l(x^{i+1})$. Go to Step 2.

STEP 2: If $l(x^{i+1}) \ge f(x^*)$, stop; $x^*$ is optimum. If $f(x^{i+1}) < f(x^*)$, set $x^* = x^{i+1}$ and $f(x^*) = f(x^{i+1})$; otherwise, the upper bound remains unchanged. Set $i = i + 1$ and go to Step 1.

The general idea of the above algorithm was first proposed by Katta Murty [7] for solving the fixed-charge problem. Murty also indicated [7, Corollary 1] that for $f(x) = D(x) + z(x)$, where $z(x)$ is linear and $D(x)$ is concave, if $l(x)$ is taken equal to $z(x)$, the algorithm is equally applicable. However, it is clear that Murty's corollary is true only if $z(x) \le z(x) + D(x)$.
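Steps 0-2 can be demonstrated with the cutting-plane ranking machinery replaced by a brute-force ranking of a small, explicitly listed vertex set — an illustrative simplification, not the paper's method, and the polytope, objective, and underestimator below are invented:

```python
def concave_min_by_ranking(vertices, f, l):
    """Steps 0-2 of the algorithm over a given list of extreme points.
    l must be an underestimator of f on the polyhedron, as in (2)."""
    ranked = sorted(vertices, key=l)        # x^0, x^1, ... in order of l(x)
    x_star = ranked[0]                      # Step 0: incumbent = argmin l(x)
    f_star = f(x_star)                      # upper bound f(x*)
    for x in ranked[1:]:                    # Step 1: next ranked extreme point
        if l(x) >= f_star:                  # Step 2: lower bound meets upper bound
            break                           # remaining vertices are nonpromising
        if f(x) < f_star:
            x_star, f_star = x, f(x)        # improved incumbent
    return x_star, f_star

# Concave objective over the unit square: f = -(x+y)^2 + 3(x+y).
verts = [(0, 0), (1, 0), (0, 1), (1, 1)]
f = lambda v: -(v[0] + v[1]) ** 2 + 3 * (v[0] + v[1])
l = lambda v: v[0] + v[1] - 2               # a linear underestimator on [0,1]^2
best, val = concave_min_by_ranking(verts, f, l)
```

The early exit is exactly the termination condition $\underline f = l(x^k) \ge \bar f$: once the ranked values of $l$ reach the incumbent objective value, no unexplored vertex can improve on it.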
This obviously is not valid in general. Later, Cabot and Francis [3] utilized the exact algorithm to solve the case where $D(x)$ is a negative (semi)definite quadratic form (see section IV). The Cabot-Francis paper, however, presents the details of Murty's algorithm in a more explicit manner.

The ranking procedure of Step 1, as advanced by Murty, determines the adjacent extreme points to each element (basis) in $E^i$ as the (new) basic solutions in which one of the current (eligible) nonbasic variables is made basic. This requires carrying out a single pivot operation as in the simplex method. The major drawback of Murty's procedure is that the number of generated adjacent extreme points may become very large, to the extent of taxing the computer memory. Moreover, because the same (adjacent) extreme point may be generated from more than one element in $E^i$, a procedure is needed to avoid storing redundant points. Extensive experimentation by this author shows that Murty's algorithm, as applied to the zero-one problem, generally yields very discouraging results (see [10]). The complex bookkeeping procedures required to economize the utilization of the computer memory and to minimize redundancy show distinctly that the algorithm can very easily reach an unmanageable state.

This paper differs from the work of Murty in two respects:

(i) It presents a general algorithm which solves any problem of type (1). This is in contrast with Murty's (or Cabot and Francis') work, which leaves the impression that it can handle specialized concave problems only.

(ii) It develops a new procedure for the details of Step 1 of the algorithm which improves on the drawback of Murty's ranking scheme. The new procedure utilizes a cutting plane technique which uses the "convexity cuts" recently developed by Glover [5].

It must be noted that, by using Murty's ranking procedure, the requirement that the underestimator $l(x)$ be linear is needed only in Step 0.
This follows since the theory of linear programming automatically allows the determination of the proper extreme point, $x^0$. Clearly, the linearity assumption is not needed in the ranking procedure of Step 1. This is in contrast with the new ranking procedure, to be introduced in the next section, where the linearity of the underestimator is a mandatory requirement. This follows since the ranked extreme points are determined by applying the dual simplex method of linear programming.

III. A CUTTING PLANE METHOD FOR RANKING THE EXTREME POINTS OF Q

Informally, the idea of the new ranking scheme is explained as follows. Start with $x^0$ obtained at Step 0. Then define a cut that eliminates $x^0$ only from among all the extreme points of $Q$. The hyperplane associated with the cut is determined to pass through the adjacent extreme points of $x^0$. Now, augmenting the linear programming problem with the cut and applying the dual simplex method, the resulting optimum feasible solution yields the next ranked extreme point. A new cut can now be generated from the adjacent extreme points by using the new solution space. The process is repeated as necessary.

The above procedure is tailored after a recent development by Glover [5], who lays out a general theory for constructing legitimate cuts which can be used systematically to determine certain points in a given convex polyhedron. A typical illustration is the convex polyhedron $Q$ with its extreme points representing the set of points to be identified. Glover's theory actually generalizes earlier ideas of Young [13] and Balas [1], who developed legitimate cuts for the integer linear programming problem.

To formalize the above discussion, let the current basic solution be defined by the set of equations

(3) $y_i = b_{i0} - \sum_{j \in N} b_{ij} t_j$, $i \in M$; $\quad y_i, t_j \ge 0$,

where the $y_i$ and $t_j$ are the basic and nonbasic variables, respectively. The sets $M$ and $N$ define the indices of the basic and nonbasic variables.
The cut referred to above is now described, based on Glover's theory:

GLOVER'S CONVEXITY CUT LEMMA [5]: Let $S$ be a set of points in the convex polyhedron $Q$. If $R$ is a convex set whose interior contains no point in $S$, and if $y_i = b_{i0}$, $i \in M$ (possibly a boundary point of $R$), has a deleted feasible neighborhood which lies in the interior of $R$, then for any constants $t_j^* > 0$, $j \in N$, such that $y_i = b_{i0} - b_{ij} t_j^* \in R$ for all $i \in M$, the convexity cut†

(4) $\sum_{j \in N} \frac{t_j}{t_j^*} \ge 1$

excludes the extreme point $y_i = b_{i0}$, $i \in M$, but never any point in $S$.

†Because of the convexity requirement stipulated on the set $R$, Glover coins the suggestive name "convexity cut."

The application of the above lemma to Step 1 of the algorithm is straightforward. Here the set $S$ consists of the unranked extreme points of the current solution space. The point $y_i = b_{i0}$, $i \in M$, takes the place of the current "ranked" extreme point, and the set $R$ is represented by the convex polyhedron describing the current solution space. The determination of the constants $t_j^*$ in (4) follows directly from the theory of the simplex method; that is,

$t_j^* = \min_i \left\{ \frac{b_{i0}}{b_{ij}} \;\middle|\; b_{ij} > 0 \right\}$.

Clearly, $t_j^*$ is strictly positive if the current solution is nondegenerate, that is, $b_{i0} > 0$. When $b_{i0} = 0$ for at least one $i \in M$, then it is possible that $t_j^* = 0$ and the convexity cut (4) becomes undefined.

In order to overcome the above difficulty, resulting from a degenerate situation, we use the following procedure due to Balas [1].† Degeneracy occurs when an extreme point is "overdetermined," that is, when the current solution point has more than $n$ hyperplanes associated with it, where $n$ is the total number of variables. Balas [1] proves that by dropping each constraint for which the associated basic variable is equal to zero, the resulting convex polytope necessarily associates $n$ distinct edges with the current solution vertex. Under this condition, the values of $t_j^*$ are readily determined.
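Given a simplex tableau $(b_{i0} \mid b_{ij})$, the cut coefficients $1/t_j^*$ come from one ratio test per nonbasic column. A small sketch, with an invented $2 \times 2$ tableau (the function name and error handling are illustrative choices, not from the paper):

```python
def convexity_cut(b0, B):
    """Given the basic solution y_i = b0[i] - sum_j B[i][j] * t_j (eq. (3)),
    return the coefficients 1/t_j* of the convexity cut
    sum_j t_j / t_j* >= 1 (eq. (4)), with
    t_j* = min over i of b0[i] / B[i][j], taken over B[i][j] > 0."""
    coeffs = []
    for j in range(len(B[0])):
        ratios = [b0[i] / B[i][j] for i in range(len(b0)) if B[i][j] > 0]
        t_star = min(ratios) if ratios else float("inf")  # unbounded edge
        if t_star <= 0:
            raise ValueError("degenerate vertex: cut undefined (see Balas [1])")
        coeffs.append(1.0 / t_star)                       # coefficient is 0 if t_j* is infinite
    return coeffs

# Invented tableau: y1 = 4 - 2 t1 - 1 t2, y2 = 6 - 1 t1 - 3 t2.
cut = convexity_cut([4.0, 6.0], [[2.0, 1.0], [1.0, 3.0]])
# Resulting cut: 0.5 t1 + 0.5 t2 >= 1.
```

The raised error corresponds to the degenerate case $b_{i0} = 0$ discussed in the text, where Balas' constraint-dropping device is needed before the cut can be formed.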
Of course, when the cut is added, all the deleted constraints must be reactivated before the problem is reoptimized, unless such constraints are proved to be redundant with respect to $R$, in which case they can be dropped completely.

There are two important points which must be considered in association with the degeneracy problem. These difficulties do not arise in Balas' case, mainly because the sets $R$ and $S$ in his problem remain unaffected by the deletion of the constraints associated with the zero basic variables. This obviously is not the case in our situation.

(i) Let $C$ be the degenerate cone associated with the current solution vertex, $X$, and define $L$ as the polytope obtained from the current solution space of the problem by deleting the halfspaces associated with $C$. Further, define $C'$ as the nondegenerate cone associated with $X$ which is obtained from $C$ by deleting all the constraints satisfying Balas' condition. Since $C \subset C'$, the cut obtained from the adjacent extreme points resulting from the intersection of $C'$ with $L$ cannot be stronger than its equivalent when $C$ replaces $C'$. This means that the new cut cannot eliminate any of the extreme points of $R\ (= C \cap L)$ which have not been tested for optimality. Consequently, the cut obtained by using $C'$ is legitimate with respect to $R$.

(ii) The cut obtained by using $C'$ will most likely create new extreme points which are not part of the vertices of the original solution space. The question then arises as to the possibility of the optimal solution being "trapped" at one of these vertices. This point is refuted as follows: by the convexity cut lemma, such extreme points (when they occur) must lie on the halfline

(5) $y_i = b_{i0} - b_{ij} t_j$, $0 < t_j < t_j^*$, $i \in M$.

If (5) is an edge (or a segment thereof) of the original solution space $Q$ (defined in (1)), then the new extreme point is actually a nonvertex point of $Q$. Consequently, it cannot yield an improved solution point, as this would lead to a contradiction.
A similar argument applies if (5) is a new edge resulting from the application of previous cuts.

†It must be noted that Murty's procedure overcomes the degeneracy problem by enumerating all the basic solutions associated with the current extreme point. From the computational point of view, this has proved to be very time consuming (see [10]).

It is important to notice that the effect of degeneracy goes beyond simple inconvenience in computation. Essentially, the creation of new extreme points must reduce the efficiency of the proposed method, since it may be necessary to test these points for optimality (see the numerical example in section VI for an illustration). Consequently, serious consideration must be given to minimizing the effect of degeneracy. The work of Thompson, Tonge, and Zionts [11] provides ways for eliminating degeneracy in certain situations (as illustrated by the numerical example in section IV). However, there does not yet exist a general method for handling all degeneracy situations.

IV. DETERMINATION OF THE LINEAR UNDERESTIMATOR l(x)

In this section we show how a linear underestimator $l(x)$ can be developed for $f(x)$ in the general case. Since the efficiency of the proposed algorithm depends on the selection of the underestimator, illustrations are also given showing how tighter underestimators can be developed for an important class of concave minimization problems. This class includes the fixed-charge problem, the separable programming problem, the quadratic problem, and the 0-1 mixed integer problem.

(i) General Underestimator $l(x)$: From the properties of concave functions, a tangent hyperplane to $f(x)$ at $\bar x$ [assume $\bar x \in Q$, where $Q$ is the convex polyhedron defined in (1)] overestimates $f(x)$. Consequently, it appears plausible that we can make use of a tangent hyperplane to $g(x) = -f(x)$ (with modifications) to underestimate $f(x)$.
Let t_g(x) be a tangent hyperplane to g(x) at a given point.† Clearly, for any x,

(6)  t_g(x) ≤ g(x).

Now a transition from g(x) to f(x) can be made if g(x) ≤ f(x). Unfortunately, this is not true in general. However, if the values of x are restricted to those in Q, then the transition can be achieved as follows:

PROPOSITION: Let M ≥ 0 be a real number; then there must exist a value of M < ∞ such that

(7)  −M + t_g(x) ≤ f(x),  x ∈ Q.

In this case l(x) = −M + t_g(x).

PROOF: We need only show that −M + g(x) ≤ f(x) for all x ∈ Q. The minimum (maximum) value of f(x) (g(x)) occurs at an extreme point of Q. If min_{x∈Q} f(x) ≥ 0, then max_{x∈Q} g(x) ≤ 0, and obviously the desired result is achieved for M = 0. Now, suppose min_{x∈Q} f(x) < 0; then max_{x∈Q} g(x) > 0. By assumption, f(x) possesses a finite minimum over Q, so M can be selected such that M ≥ |min_{x∈Q} f(x)|. Since, by symmetry, |min_{x∈Q} f(x)| = max_{x∈Q} g(x), it follows that, for M ≥ 2|min_{x∈Q} f(x)|,

−M + g(x) ≤ −M + max_{x∈Q} g(x) ≤ −|min_{x∈Q} f(x)| = min_{x∈Q} f(x) ≤ f(x),  x ∈ Q.

Since M can be taken arbitrarily large, the desired conclusion follows immediately.

†We further assume that the tangent hyperplane is determined at x̄ satisfying ∇g(x̄) ≠ 0, where ∇g(x̄) is the gradient vector of g(x) at x̄. This will ensure that the resulting linear underestimator is not trivial.

CONCAVE MINIMIZATION 539

The above proposition actually implies that a linear underestimator for f(x), x ∈ Q, can be taken as

l(x) = −Σ_{j∈N} m_j x_j − M,

where the m_j are positive constants and M ≥ |min_{x∈Q} f(x)|. If min_{x∈Q} f(x) ≥ 0, then M can be taken equal to zero. Notice that since the lower bound on M is obviously not known a priori, one must rely on some practical estimate to determine a numerical value for M. Although any values of m_j > 0 can be utilized, further research is needed to determine the best set of values providing the tightest linear underestimator.
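The proposition lends itself to a quick numerical check. The sketch below is purely illustrative (the concave f, the box standing in for Q, and the tangency point are all invented for the example, not taken from the paper): it builds the tangent plane t_g from the gradient of g = −f and verifies that −M + t_g underestimates f over Q for M ≥ 2|min_Q f|.

```python
import numpy as np

# Hypothetical concave objective f(x) = -||x||^2 on the box Q = [0, 2]^2
# (illustrative choices, not from the paper).
def f(x):
    return -np.dot(x, x)

def g(x):                            # g = -f is convex
    return np.dot(x, x)

xbar = np.array([1.0, 1.0])          # tangency point with grad g != 0
grad_g = 2.0 * xbar                  # gradient of g at xbar

def t_g(x):                          # tangent hyperplane to g at xbar
    return g(xbar) + grad_g @ (x - xbar)

# A concave f attains its minimum over Q at a vertex of Q.
vertices = [np.array(v, float) for v in
            [(0, 0), (0, 2), (2, 0), (2, 2)]]
min_f = min(f(v) for v in vertices)
M = 2.0 * abs(min(min_f, 0.0))       # M >= 2|min f| as in the proposition

def l(x):                            # linear underestimator of f on Q
    return t_g(x) - M

# l(x) <= f(x) must hold throughout Q; spot-check on a grid.
grid = [np.array([a, b]) for a in np.linspace(0, 2, 9)
                         for b in np.linspace(0, 2, 9)]
assert all(l(x) <= f(x) + 1e-9 for x in grid)
print("underestimator verified on Q")
```

The looseness of this l(x) (it sits 2|min f| below the tangent plane) is exactly why the tighter, structure-specific underestimators developed next are of interest.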
(ii) Fixed-Charge Problem: In the fixed-charge problem, f(x) is defined by

f(x) = Σ_{j∈N} c_j x_j + Σ_{j∈N} K_j δ(x_j),  N = {1, . . ., n},

where δ(x_j) = 0 if x_j = 0, and δ(x_j) = 1 if x_j > 0. The coefficients c_j and K_j are real numbers with K_j > 0 for all j. It can be proved that f(x) is a concave function which is continuous everywhere except at x = 0. Now, since K_j > 0 by assumption, it follows that

Σ_{j∈N} c_j x_j ≤ Σ_{j∈N} c_j x_j + Σ_{j∈N} K_j δ(x_j),  x_j ≥ 0.

This shows that the linear underestimator can be taken as

(8)  l(x) = Σ_{j∈N} c_j x_j.

Notice that l(x) is valid for any x_j ≥ 0. The application of the above estimator will be illustrated numerically in the next section. Notice that the same idea can be utilized to solve certain problems that often arise in inventory theory. A typical example is the finite horizon, multiple-item model in which price breaks (or quantity discounts) are allowed in the ordering function. This typically results in a piecewise-linear concave cost function. In this case, the linear segments representing the smallest per unit ordering cost can be used to determine l(x).

(iii) Separable Programming Problem: Let f(x) = Σ_j f_j(x_j), where f_j(x_j) is a concave and well defined function. Suppose now that the feasible range for each x_j as defined by the solution space Q is given by a_j ≤ x_j ≤ b_j, where a_j and b_j are known constants. Let l_j(x_j) = α_j x_j + β_j be the straight line joining the two points (a_j, f_j(a_j)) and (b_j, f_j(b_j)). Since f_j(x_j) is concave, it follows by definition that

l_j(x_j) ≤ f_j(x_j),  x ∈ Q.

Consequently,

(9)  l(x) = Σ_j l_j(x_j).

It is noted that the fixed-charge problem discussed in (ii) satisfies the condition for a separable concave problem. Consequently, the above linear underestimator can also be used with the fixed-charge problem. Notice, however, that the present underestimator is tighter than that defined in (ii). This follows since it is defined for constrained values of x only, as compared with x ≥ 0 in the fixed-charge problem.
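The chord construction in (iii) is easy to sketch. In the snippet below, the concave piece f_j and the interval [a_j, b_j] are illustrative choices, not taken from the paper; concavity alone guarantees the secant line lies below f_j on the interval.

```python
import math

# Secant underestimator for one separable concave piece, as in (iii).
# Illustrative assumption: f_j(x) = sqrt(x) on [a_j, b_j] = [1, 4].
def secant(fj, a, b):
    """Line l_j(x) = alpha*x + beta through (a, fj(a)) and (b, fj(b))."""
    alpha = (fj(b) - fj(a)) / (b - a)
    beta = fj(a) - alpha * a
    return lambda x: alpha * x + beta

fj = math.sqrt
a, b = 1.0, 4.0
lj = secant(fj, a, b)

# Concavity of fj guarantees lj(x) <= fj(x) everywhere on [a, b].
xs = [a + k * (b - a) / 20 for k in range(21)]
assert all(lj(x) <= fj(x) + 1e-12 for x in xs)
print("secant underestimates fj on [a, b]")
```

Summing one such chord per variable gives the underestimator (9); note that it agrees with f_j at both endpoints, which is what makes it tighter than the bound in (ii).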
(iv) Quadratic Minimization Problem: Let f(x) = z(x) + D(x), where z(x) is linear and D(x) is a negative (semi)definite quadratic form. If D(x) = xBxᵀ, then B is negative (semi)definite. The linear underestimator in this case was developed by Gilmore [4] and by Lawler [6] and was subsequently utilized by Cabot and Francis [3] in connection with Murty's algorithm. Let b_j be the jth column of B. Then D(x) can be written as

D(x) = Σ_{j∈N} (x b_j) x_j.

Define u_j = min_{x∈Q} {x b_j}. Hence

(10)  l(x) = z(x) + Σ_{j∈N} u_j x_j ≤ z(x) + D(x).

Notice that since x_j ≥ 0 for all j, then, for any finite w_j ≤ u_j, z(x) + Σ_{j∈N} w_j x_j still provides a legitimate underestimator. This result can sometimes be used advantageously to avoid solving n linear programs. An application of this situation is given below.

(v) Zero-One Mixed Integer Problem: In this problem

f(x) = Σ_{j∈N} c_j x_j,  x_j = (0, 1) for j ∈ N¹ ⊆ N.

The function f(x) can be written equivalently as

f(x) = Σ_{j∈N−N¹} c_j x_j + Σ_{j∈N¹} (c_j + M(1 − x_j)) x_j,  M > 0 and very large,

where x_j ≥ 0, j ∈ N, and x_j ≤ 1, j ∈ N¹. The expression M(1 − x_j) assigns a very high penalty to x_j for 0 < x_j < 1, j ∈ N¹, thus allowing it to take binary values only. The mixed integer objective function has been equivalently converted into a quadratic function in which the quadratic form M Σ_{j∈N¹} (−x_j²) is clearly negative definite. Notice that all the variables in the new form are continuous. The above equivalence relationship was developed by Raghavachari [8] and independently by Taha [10] in an effort to secure a simpler formulation for the mixed 0-1 integer problem. The transformed f(x) is exactly in the same form as the function in (iv). Thus, using the method in (iv), u_j, j ∈ N¹, is defined by

(11)  u_j = min {−Mx_j | x ∈ Q, 0 ≤ x_j ≤ 1} ≥ min {−Mx_j | 0 ≤ x_j ≤ 1} = −M.

Thus, taking u_j = −M, it follows from the development in (iv) that

l(x) = Σ_{j∈N} c_j x_j.
This shows that l(x) can be taken as f(x) after removing the condition x_j = (0, 1), j ∈ N¹. Notice that if u_j is determined from the exact linear program in (11), the main difference would be that 0 ≤ x_j < 1 for some j ∈ N¹, that is, x_j = 0. This means that the new l(x) will be the same as above except that the indicated x_j are set equal to zero.

The above result can also be derived on an intuitive basis. Since for the 0-1 mixed integer problem the optimum must occur at an extreme point, the integrality condition can be replaced by the continuous range 0 ≤ x_j ≤ 1. It then follows that {Σ c_j x_j | 0 ≤ x_j ≤ 1, j ∈ N¹} must underestimate {Σ c_j x_j | x_j = (0, 1), j ∈ N¹}, since the former is less restrictive. This, incidentally, means that the transformation of f(x) given above does not yield any privileged information and hence is trivial.

Notice that by using the continuous range 0 ≤ x_j ≤ 1, j ∈ N¹, the resulting objective function becomes linear in x_j over its feasible values. Thus the new objective function may be considered concave over the feasible space, and the general algorithm in section II becomes applicable. In this case the upper bound f̄ is defined equal to ∞ for any extreme point not satisfying x_j = (0, 1), j ∈ N¹. The important point, however, is that the cut as defined in section III is uniformly weaker than its equivalent as developed by Balas [1]. On the other hand, the determination and use of Balas' stronger cut requires more complex computation as compared with ours. Consequently, the real merit of either cut can only be checked through computational experimentation.

V. COMPUTATIONAL EXPERIENCE WITH THE FIXED-CHARGE PROBLEM

This section illustrates the efficiency of the proposed algorithm by applying it to the fixed-charge problem. This special case is selected primarily because of its practical interest.
In addition, the availability in the literature of computational results for other fixed-charge methods allows a more meaningful evaluation of the proposed algorithm. In order to clarify the details of the algorithm, especially those associated with the degeneracy problem, we first introduce a numerical example. This will be followed by a presentation of the computer results as applied to randomly generated problems.

EXAMPLE:

minimize f(x) = φ₁(x₁) + φ₂(x₂)

subject to

2x₁ + x₂ + S₁ = 4
x₁ + x₂ + S₂ = 3
0 ≤ x₂ ≤ 5/2
x₁, S₁, S₂ ≥ 0,

where

φ₁(x₁) = 0 if x₁ = 0, and φ₁(x₁) = −4x₁ + 1 if x₁ > 0,
φ₂(x₂) = 0 if x₂ = 0, and φ₂(x₂) = −3x₂ + 1/2 if x₂ > 0.

Thus, l(x) = −4x₁ − 3x₂. Using Dantzig's technique to accommodate the upper bound on x₂, Table I gives the solution specifying x⁰. A graphical display of the solution is given in Figure 1.

Table I

            S₁    S₂
l  = −10    −1    −2
x₁ =   1     1    −1
x₂ =   2    −1     2

x⁰ = (1, 2); point (0)
l(x⁰) = −10
f(x⁰) = −10 + (1 + 1/2) = −8 1/2
f(x*) = f(x⁰) = −8 1/2

Cut #1 is now developed. (Notice that the determination of the constants (t*) of the cut must be based on Dantzig's upper bounding technique.) Thus,

t* = min {1/1, (1/2)/1, ∞} = 1/2 (for S₁),
t* = min {2/2, ∞, ∞} = 1 (for S₂),

and the cut is given by

S₁/(1/2) + S₂/1 ≥ 1.

FIGURE 1. Solution of the numerical example (showing cuts #1, #2, and #3).

Expressed in terms of x₁ and x₂, the cut is

(Cut #1)  5x₁ + 3x₂ ≤ 10.

We denote its slack by S₃. Table II yields x¹ as a result of augmenting Table I by Cut #1 and reoptimizing using the dual simplex method for upper bounded variables.

Table II

               S₂     S₃
l  = −9 1/2   −3/2   −1/2
x₁ =   1/2    −3/2    1/2
x₂ =   5/2     5/2   −1/2
S₁ =   1/2     1/2   −1/2

x¹ = (1/2, 5/2); point (1)
l(x¹) = −9 1/2
f(x¹) = −9 1/2 + (1 1/2) = −8 > f(x*)
f(x*) = f(x⁰)

Notice that in Table II, x₂ is basic at its upper bound. This means that the current solution is degenerate. Using Balas' condition which, in this case, calls for ignoring the equations involving basic variables at upper bound or zero level, it is clear that the x₂-equation must be disregarded in developing Cut #2. Thus,

t* = min {(1/2)/(1/2), ∞, ∞} = 1 (for S₂),
t* = min {(1/2)/(1/2), ∞, ∞} = 1 (for S₃).

This yields the new cut S₂/1 + S₃/1 ≥ 1 which, when expressed in terms of x₁ and x₂, is given by

(Cut #2)  6x₁ + 4x₂ + S₄ = 12,  S₄ ≥ 0.

Table III gives the new solution after Cut #2 is effected. Notice that x₂′ = 5/2 − x₂. Notice also that since S₃ is associated with a previous cut and since it is basic, its corresponding equation can be dropped in future tableaus.

Table III

               S₄     x₂′
l  = −8 5/6   −1/3   −2/3
x₁ =   1/3    −2/3    1/6
S₂ =   1/6    −1/3   −1/6
S₁ =   5/6     1/3   −1/3
S₃ =   5/6     1/3   −5/6

x² = (1/3, 5/2); point (2)
l(x²) = −8 5/6
f(x²) = −8 5/6 + 1 1/2 = −7 1/3 > f(x*)
f(x*) = f(x⁰)

Cut #3 is now generated from Table III. This gives

(Cut #3)  30x₁ + 24x₂ ≤ 60.

The application of this cut yields point (3) with x³ = (2, 0) and l(x³) = −8. Since l(x³) > f(x*), the process terminates. Thus x* = x⁰ is the optimum solution.

Notice the effect of degeneracy at point (1). Point (1) is (over)determined by the three lines x₂ = 5/2, x₁ + x₂ = 3, and 5x₁ + 3x₂ = 10. Balas' condition drops x₂ = 5/2. The cone C̄, as introduced in section III, is then defined by the halfplanes x₁ + x₂ ≤ 3 and 5x₁ + 3x₂ ≤ 10, which yields Cut #2. The optimum point (2) of the cut problem is a new extreme point which does not belong to the original solution space. It is remarked that if the redundant constraint x₁ + x₂ ≤ 3 is eliminated instead of x₂ ≤ 5/2, then Cut #2 would have been stronger, as it would pass through points (0) and (3). Stanley Zionts, in a private communication to the author, shows that by using the results in [11], this specific degeneracy situation can be avoided. The idea is as follows: Prior to constructing a cut constraint, if there is any degeneracy, write the degenerate constraint so that the right-hand-side element is zero. (In [11], methods for identifying redundant, and of course redundant degenerate, constraints are provided.)
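Since f is concave, its minimum over Q must lie at an extreme point, so the example's optimum can be confirmed independently by enumerating the vertices of Q. The short script below (Python used as a neutral sketch, not the author's FORTRAN code) reproduces x* = (1, 2) with f(x*) = −8 1/2.

```python
from itertools import combinations
import numpy as np

# Brute-force check of the example's optimum: f is concave, so the
# minimum lies at a vertex of Q; enumerate the vertices directly.
def f(x1, x2):
    phi1 = -4 * x1 + 1 if x1 > 1e-9 else 0.0    # fixed charge K1 = 1
    phi2 = -3 * x2 + 0.5 if x2 > 1e-9 else 0.0  # fixed charge K2 = 1/2
    return phi1 + phi2

# Constraints as a_row . x <= b:
# 2x1 + x2 <= 4,  x1 + x2 <= 3,  x2 <= 5/2,  -x1 <= 0,  -x2 <= 0
A = np.array([[2, 1], [1, 1], [0, 1], [-1, 0], [0, -1]], float)
b = np.array([4, 3, 2.5, 0, 0], float)

verts = []
for i, j in combinations(range(len(A)), 2):
    M = A[[i, j]]
    if abs(np.linalg.det(M)) < 1e-9:
        continue                                # parallel pair, no vertex
    x = np.linalg.solve(M, b[[i, j]])
    if np.all(A @ x <= b + 1e-9):               # feasibility check
        verts.append(tuple(np.round(x, 6)))

best = min(set(verts), key=lambda v: f(*v))
print(best, f(*best))
```

The enumeration visits the five vertices (0, 0), (2, 0), (0, 5/2), (1/2, 5/2), and (1, 2), and the minimum is attained at (1, 2), in agreement with the cut-ranking computation above.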
Applying this to Table II, x₂ is replaced by 5/2 − x₂′. In order for the redundant constraint to be implied in "definitional" form, x₂′ must now be made nonbasic, with S₂ being the new basic variable. This yields Table IV (Table II revised).

Table IV

               x₂′    S₃
l  = −9 1/2   −3/5   −4/5
x₁ =   1/2    −3/5    1/5
S₂ =   0      −2/5   −1/5
S₁ =   1/2     1/5   −2/5

Notice that the S₂-row is now redundant and may be dropped from the tableau. But more importantly, the generated cut is

x₂′/(5/2) + S₃/(5/2) ≥ 1,

which now passes through points (2, 0) and (0, 5/2), thus bypassing the extra point (2) and its associated cut.

Computer Results

The testing of the algorithm as applied to the fixed-charge problem is designed to check the effect of the size of the problem and of the magnitude of the fixed charge on the speed of computation. Random problems of the type

max {Σ_j (c_j x_j + K_j δ(x_j)) | Σ_j a_ij x_j ≤ b_i, x_j ≥ 0, i = 1, . . ., m}

are generated with their coefficients lying in the ranges

0 ≤ c_j ≤ 999,  0 ≤ K_j ≤ 160,  −20 ≤ a_ij ≤ 100,  0 ≤ b_i ≤ 200.

The sizes of the generated problems are given by (m × n) = (5 × 20), (5 × 30), (10 × 20), and (15 × 30). In order to test the effect of the fixed charge, the same problems are used again with K_j replaced by 2K_j and 3K_j, respectively. No special structure is specified for the problems, and the density of the matrix A = ||a_ij|| is at least 97 percent. The algorithm is coded in FORTRAN IV for the IBM 360/50. The results are summarized in Table V.

One of the basic difficulties we encountered in coding the algorithm was the control of machine round-off error. This is important since a zero variable may be rounded to a positive value, thus affecting the bounds directly.

TABLE V. Summary of Computation (Time in seconds)

Problem   (m×n)=(5×20)            (m×n)=(5×30)            (m×n)=(10×20)              (m×n)=(15×30)
number    Kj     2Kj    3Kj      Kj     2Kj    3Kj      Kj      2Kj      3Kj       Kj      2Kj      3Kj
1         0.300  0.500  10.70    0.484  0.513  0.683    64.917  65.117   150.317   40.600  44.150   702.500
2         0.216  0.250  0.250    0.366  0.467  23.650   22.833  22.950   127.950   9.184   11.400   11.167
3         0.250  0.283  0.450    0.467  0.483  0.483    1.417   192.117  192.117   59.000  66.183   72.350
4         0.216  0.717  1.717    1.530  2.633  3.116    70.650  77.117   81.980    3.650   417.183  702.500
5         0.317  0.317  2.783    0.866  0.750  1.550    1.550   2.000    33.967    19.984  135.433  803.117
Average   0.260  0.413  3.180    0.742  0.969  5.896    32.273  71.860   97.266    26.482  134.868  327.552

The problem was overcome by using double-precision computation as well as appropriate tolerances. Also, checks were implemented in the code to detect the accumulation of machine round-off error. For example, an important check is to test whether at a given iteration the number of positive variables among the original variables exceeds the number of original constraints. It must be remarked that the five problems in Table V were selected from among 20 test problems as the ones yielding the least amount of "disorder" from the viewpoint of machine round-off error. The remaining problems were excluded by the checks in the code because they indicated uncontrollable round-off error. It is felt, however, that a professional programmer should be able to develop a more efficient and accurate code than the one written by the author.

Although the results in Table V are generally compatible with what one may expect, namely that the average computation time increases with the increase in the fixed charges, the individual problems exhibit peculiar behavior which needs explanation. For example, problem 3 of size (10 × 20) requires 1.5 seconds for K_j, 192 seconds for 2K_j, and again 192 seconds for 3K_j.
This result can be justified as follows: The termination of the algorithm occurs at the extreme point xᵗ when l(xᵗ) ≥ f(x*). It is obvious that the computation time of the problem is primarily a function of the number of extreme points which are ranked before termination occurs. Thus, two problems having the same solution space will require the same computation time if they terminate at the same xᵗ. Notice that l(x) depends only on the linear terms of the objective function, and that its value at an extreme point does not depend on the fixed charges, while f(x*) is directly dependent on the fixed charges. Consequently, if l(xᵗ) − f(x*) for 2K_j is large enough to accommodate an increase in the fixed charges to 3K_j, termination still occurs at xᵗ and the same computation time is consumed. Similarly, if l(xᵗ) − f(x*) for K_j is too small, an increase in the fixed charges to 2K_j may necessitate further ranking of new extreme points before termination is effected.

The results in Table V also show that the computation time increases more appreciably with the increase in the number of constraints than with the number of variables. These results differ from those associated with cutting plane algorithms in integer programming, where the number of variables is the main factor affecting the computation time. The reason for this appears to be that our algorithm depends more directly on the number of extreme points of the solution space, which is a function of both the number of constraints and the number of variables.

For the sake of comparing our algorithm with other exact methods for the fixed-charge problem, we only came across two algorithms, by Bod [2] and Steinberg [9]. The two methods are of the branch and bound type. Bod's method utilizes what may be termed a partial enumeration technique for testing all the extreme points (basic feasible solutions) of the convex polyhedron.
The effective use of bounds on the objective value excludes most of the nonpromising extreme points. Steinberg's method, on the other hand, initiates two problems at each node according to whether the variable x_j is zero or positive. Bounds on the objective value are also used to effect the proper termination of the algorithm. Bod does not present computer results for his algorithm, but Steinberg tests two sets of problems with sizes (5 × 10) and (15 × 30) on the IBM 360/50. The average computation times per problem for the two sets are 10 sec and 21.1 min, respectively. This is far inferior to the average computation time obtained by our algorithm, especially since Steinberg's algorithm can easily tax the computer memory. He reports that a set of 15 problems of size (5 × 10) each requires an average of 32 nodes, while those of size (15 × 30) each require an average of 1,208 nodes. This shows that the number of nodes can become very large even for problems of modest sizes. This problem is not present in our algorithm since, as in any cutting plane algorithm, the size of the matrix A at any iteration cannot exceed (m + n) × n. We must remark also that, contrary to our algorithm, Steinberg's algorithm becomes slower as the magnitude of the fixed charge decreases. He utilizes the ranges 0 ≤ c_j ≤ 20 and 0 ≤ K_j ≤ 999 for his test problems, but does not study the effect of variations in K_j on the speed of computation.

VI. CONCLUSIONS

The algorithm presented in this paper is general in the sense that it can handle any concave minimization problem over a convex polyhedron. If the computer results of the algorithm as applied to the fixed-charge problem are at all indicative of its efficiency, it would appear that the algorithm can actually be used to solve practical problems. Further research is still needed, however, to develop the tightest linear underestimator for f(x).
Also, since degeneracy is a pronounced problem in our algorithm, a general method is needed for treating the degenerate case without weakening the resulting cuts. This should improve the efficiency of computation considerably.

VII. ACKNOWLEDGMENT

The author wishes to thank Professor Stanley Zionts, State University of New York at Buffalo, for his helpful comments.

REFERENCES

[1] Balas, E., "Intersection Cuts: A New Type of Cutting Plane for Integer Programming," Operations Research 19, 19-39 (1971).
[2] Bod, P., "Solution of a Fixed Charge Linear Programming Problem," Proceedings of the Princeton Symposium on Mathematical Programming (Princeton University Press, Princeton, New Jersey, 1970), pp. 367-375.
[3] Cabot, A. V. and R. L. Francis, "Solving Certain Nonconvex Quadratic Minimization Problems by Ranking the Extreme Points," Operations Research 18, 82-86 (1970).
[4] Gilmore, P. C., "Optimal and Suboptimal Algorithms for the Quadratic Assignment Problem," SIAM Journal 10, 305-313 (1962).
[5] Glover, F., "Convexity Cuts and Cut Search," Operations Research 21, 123-134 (1973).
[6] Lawler, E. L., "The Quadratic Assignment Problem," Management Science 9, 586-599 (1963).
[7] Murty, K. G., "Solving the Fixed Charge Problem by Ranking the Extreme Points," Operations Research 16, 268-279 (1968).
[8] Raghavachari, M., "On the Zero-One Integer Programming Problem," Operations Research 17, 680-684 (1969).
[9] Steinberg, D. I., "The Fixed Charge Problem," Nav. Res. Log. Quart. 17, 217-235 (1970).
[10] Taha, H., "On the Solution of Zero-One Linear Programs by Ranking the Extreme Points," Technical Rept. No. 71-2, University of Arkansas (Feb. 1971), revised May 1972.
[11] Thompson, G. L., F. Tonge, and S. Zionts, "Techniques for Removing Nonbinding Constraints and Extraneous Variables from Linear Programming Problems," Management Science 12, 588-608 (1966).
[12] Tuy, H., "Concave Programming Under Linear Constraints," Soviet Math 5, 1437-1440 (1964).
[13] Young, R. D., "New Cuts for a Special Class of 0-1 Integer Programs," Research Report, Rice University, Texas (Nov. 1968).

ESTIMATION OF A HIDDEN SERVICE DISTRIBUTION OF AN M/G/∞ SYSTEM*

Laurence Lee George
University of Louisville
Louisville, Kentucky

and

Avinash C. Agrawal
University of British Columbia
Vancouver, B.C., Canada

ABSTRACT

The maximum likelihood estimator of the service distribution function of an M/G/∞ service system is obtained based on output time observations. This estimator is useful when observation of the service time of each customer could introduce bias or may be impossible. The maximum likelihood estimator is compared to the estimator proposed by Mark Brown [2]. Relative to each other, Brown's estimator is useful in light traffic while the maximum likelihood estimator is applicable in heavy traffic. Both estimators are compared to the empirical distribution function based on a sample of service times and are found to have drawbacks, although each estimator may have applications in special circumstances.

1. INTRODUCTION

Suppose customers arrive at a service system at instants T₁, T₂, . . ., Tₙ, where {Tₙ} is a stationary Poisson process with rate parameter λ customers per unit time. Each customer is served upon arrival and there are sufficient servers. Service times are independently and identically distributed with some unknown distribution function G(t), t ≥ 0. These conditions describe the M/G/∞ service system. They are often found in self-service systems. In the design of such systems it may be necessary to determine the unknown service distribution. Direct observations on the service time of each customer that enters the system may not be possible because of economic constraints, or because of other factors such as the introduction of unavoidable bias, or simply because the actual behaviour of the customers while in the system is unobservable.
An example of the first case may be cars entering a freeway, where the distribution function of the time spent by cars on the freeway is to be estimated and tracing each car individually to find the time spent on the freeway may be extremely expensive. A similar situation may exist in any store where it may not be possible to follow each customer through the store. Another effect of making direct observations on service time is to bias the observations, as customers may become conscious of being observed. It is for these reasons that direct observations on service time may not be possible. The service distribution, therefore, is hidden, and estimation must be based on information other than a sample of service times.

*This research was supported in part by the Defence Research Board of Canada Grant Number 9701-25, when the authors were at the University of British Columbia.

550 L. L. GEORGE AND A. C. AGRAWAL

2. MAXIMUM LIKELIHOOD ESTIMATOR OF THE HIDDEN SERVICE DISTRIBUTION WHEN λ IS KNOWN AND OBSERVATIONS ON OUTPUT TIMES ARE AVAILABLE

Mirasol [5] shows that the output of an M/G/∞ service system is a nonstationary Poisson process with

(2.1)  Pr (number of departures in (0, t) = n | system initially empty) = e^{−λ∫₀ᵗ G(x)dx} (λ∫₀ᵗ G(x)dx)ⁿ / n!,  n = 0, 1, . . .,

where G(·) is the common service time distribution function and λ is the Poisson arrival rate. The intensity function of this time dependent process, λ·G(t), is both nonnegative and nondecreasing, and is bounded above by λ, the Poisson arrival rate. The likelihood function for a nonstationary Poisson process with t₁, t₂, . . ., tₙ as the times of occurrence of events is given by the joint density function

(2.2)  f(t₁, t₂, . . ., tₙ; λ(t)) = Pr (observing events at t₁, t₂, . . ., tₙ; λ(t)) = [∏ᵢ λ(tᵢ)] · exp (−∫ λ(x) dx),

where λ(t) is the intensity function of the Poisson events.
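The likelihood (2.2) is straightforward to evaluate once λ(t) is fixed. The sketch below does so in log form; the piecewise-constant intensity and the event times are invented for the illustration and are not taken from the paper.

```python
import math

# Evaluating the nonstationary-Poisson likelihood (2.2) in log form
# for an assumed piecewise-constant intensity lambda(t).
events = [0.4, 1.1, 1.7, 2.3]         # event times t_1 < ... < t_n in [0, T]
T = 3.0

def lam(t):
    # Assumed intensity: 1 before t = 1, then 2 (nondecreasing, as required)
    return 1.0 if t < 1.0 else 2.0

def integral_lam(T):
    # Closed-form integral of this particular lam over [0, T]
    return min(T, 1.0) * 1.0 + max(T - 1.0, 0.0) * 2.0

def log_likelihood(ts, T):
    """log of (2.2): sum of log lambda(t_i) minus integral of lambda."""
    return sum(math.log(lam(t)) for t in ts) - integral_lam(T)

ll = log_likelihood(events, T)
print(round(ll, 4))
```

Maximizing this quantity over all nonnegative, nondecreasing λ(t), rather than evaluating it for a fixed λ(t), is exactly the constrained problem solved by Boswell's estimator described next.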
The first step in the problem under study involves finding a function λ(t), t ≥ 0, which maximizes the likelihood function given by Equation (2.2) for fixed t₁, t₂, . . ., tₙ under the condition that λ(t), t ≥ 0, is nonnegative and nondecreasing. The maximum likelihood estimate of λ(t), t ≥ 0, satisfying these conditions has been obtained by Boswell [1] as

(2.3)  λ̂(t) = 0 if 0 ≤ t < t₁,
       λ̂(t) = min {M, λ̂(t_k)} if t_k ≤ t < t_{k+1}, k = 1, 2, . . ., (n − 1),
       λ̂(t) = M < ∞ if t ≥ tₙ,

where

(2.4)  λ̂(t_k) = max_{α ≤ k} min_{k ≤ β ≤ n−1} { (β − α + 1) / (a_α + a_{α+1} + · · · + a_β) },

and

a_k = t_{k+1} − t_k,  k = α, α + 1, . . ., β.

It may be noted that in the absence of an upper bound M on the value of λ(t), the solution obtained will carry no meaning, as (2.2) can be made arbitrarily large by setting λ(t) = ε > 0 for t < tₙ and setting λ(tₙ) arbitrarily large. Therefore, let λ(t) ≤ M for some fixed positive number M.

HIDDEN SERVICE DISTRIBUTION 551

The maximum likelihood estimate λ̂(t) of the function λ(t), t ≥ 0, may be used to estimate λ·G(t), t ≥ 0, from output observations of an M/G/∞ system during some interval [0, T], T > t. This will give an estimate of λ·G(t) for t ∈ [0, T]. To obtain estimates of λ·G(t) for small t, the output process should be observed for small t. For large values of t, the output becomes a stationary Poisson process at rate λ, and G(t) is estimated as 1. If the input rate λ is assumed to be known (it may be estimated from input data) and it is also assumed that the system starts empty, then the maximum likelihood estimate of the service distribution function G(t) is given by:

(2.5)  Ĝ(t) = min [λ̂(t)/λ, 1].

In case it is desired to relieve the assumption that the system starts empty at t = 0, one must consider the first outputs as order statistics from G(t), given the number in the system at t = 0, possibly mixed with outputs which arrive after t = 0. The hidden service distribution G(t) for an M/G/∞ system may also be estimated by peeking at the system only at times t₁, t₂, . . .
tₙ and observing the number in the system N(t₁), . . ., N(tₙ). This sequence may also be used for maximum likelihood estimation, because the number of customers in the M/G/∞ service system is also a nonstationary Poisson process, with intensity function λ(1 − G(t)) nonincreasing in t. The maximum likelihood estimator λ̂(t) from (2.4) may be made into a maximum likelihood estimator of a nonincreasing function by reversing the max-min operation in (2.4). From this estimate of λ(1 − G(t)), t ≥ 0, one can obtain an estimate of G(t), t ≥ 0. Simulated output times of M/D/∞ and M/M/∞ service systems have been used in calculation of the maximum likelihood estimators. Comparison of these estimates to the true service distribution and to another estimator is made in section 4.

3. PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATOR FOR G(t)

The maximum likelihood estimator of λ·G(t) is a step function with jumps at the output times T₁ ≤ T₂ ≤ . . . ≤ Tₙ. The first nonzero value of the estimate of G(t) occurs at or after T₁, giving no information about G(t) for t < T₁. This limitation may be removed by taking observations on N repeated runs of the service system starting empty. The ordered output times for all runs are used in calculating the estimator of G(t). The maximum likelihood property of this estimator still holds, and the estimator Ĝ_N(t) is given by

(3.3)  Ĝ_N(t) = min [λ̂(t)/(Nλ), 1],

where N is the number of runs. A lower bound on the expected time of the first output from N runs is the expected first input time in N runs, 1/(λN). Extreme value theory suggests that asymptotically the time of the first observation on G(t) will become smaller as the number of runs increases. Let D_ij be the departure time of the ith customer in the jth run, where i = 1, 2, . . ., n, j = 1, 2, . . ., N. D_ij = T_ij + S_ij, where T_ij and S_ij are the arrival and service times, respectively. The first departure over N runs will take place at the time given by min_{i,j} {T_ij + S_ij}.
By extreme value theory, min_{i,j} {T_ij + S_ij} will have a Weibull distribution asymptotically, Gumbel [3], no matter what the distribution function of the random variable (T_ij + S_ij) may be, provided (T_ij + S_ij) > 0. The expected value of a random variable x having a Weibull distribution is given by

(3.4)  E(x) = α · Γ(1 + 1/β),

where α is the scale parameter and β is the shape parameter of the Weibull distribution. The scale parameter α can be estimated as the mth order statistic (m counted from the bottom) for which

(3.5)  1 − m/(N + 1) = 1/e = 1/2.718 . . ..

As N increases, m will increase, which means the value of α, the scale parameter, will decrease. Thus the expected value given by Equation (3.4) will go to zero asymptotically for large values of N. In the context of the service system, this means that the expected time of first departure will asymptotically decrease to the lower support of the distribution as the number of runs increases. In other words, the mean of the minimum order statistic of a random variable is of the order of the quantile for which the probability value is (1 − 1/e), and thus will decrease to the smallest possible value of the random variable asymptotically with N. A simple illustration can be given by considering the service time to be a constant, t₀. The expected time of the first departure in N runs, each run with n observations, is given by

(3.6)  E min_{i,j} {T_ij + S_ij} = E{min_{i,j} (T_ij) + t₀} = E{min_{i,j} (T_ij)} + t₀ = 1/(λNn) + t₀,

where λ is the arrival rate of the Poisson arrival stream. It can be seen from (3.6) that the expected time of the first departure in the case of constant service time t₀ converges to the lower bound of the support of the service time distribution faster than 1/(N + 1) as long as n > 1.
The proof by Marshall and Proschan [4] of strong consistency of the maximum likelihood estimate of a distribution function under the assumption of increasing failure rate may be applied to show that the maximum likelihood estimator of an increasing distribution function G(t) is strongly consistent at the points of continuity; i.e.,

(3.1)  Ĝ_N(t) → G(t), with probability 1,

for a sufficiently large number of repeated observations N on the service system output starting empty. This may be done because the failure rate function r(t), given as r(t) = F′(t)/(1 − F(t)) for an increasing failure rate distribution function F(t), corresponds to a nondecreasing intensity function λ(t) of a nonstationary Poisson process. In fact, λ(t) is the failure rate function of the distribution function of the event times conditional on previous event times. The maximum likelihood estimate of λ(t) based on event times of a nonstationary, nondecreasing-intensity Poisson process is the same as the maximum likelihood estimator of the failure rate function r(t) from a nondecreasing failure rate distribution.

4. NUMERICAL RESULTS AND COMPARISON WITH BROWN'S [2] ESTIMATOR

4.1 Brown's Estimator

In Figure 1, the number of customers in the system N(t) is plotted against time t. Let the origin on the time axis be shifted to the right so that it coincides with the first output after the old origin 0.

FIGURE 1. Number of units in the system N(t) vs. time t.

Y_i, i = 1, 2, . . ., n, is the time between the new origin 0′ and the ith output point after the new origin. Z_i is the time from Y_i back to the nearest input point prior to Y_i. For a stationary input process and independent, identically distributed service times in steady state behavior, the Z_i are independent and identically distributed. Let H(·) be the distribution function of Z_i, i = 1, 2, . .
and H_n(·) the empirical distribution function based on the observations Z_1, Z_2, ..., Z_n. Then

Pr[Z_i > x] = Pr[time back from ith output to last previous input > x]
            = 1 − H(x)
            = Pr[no input in an interval of length x, and service takes longer than x]
            = e^{−λx} (1 − G(x)),

or

1 − H(x) = e^{−λx} (1 − G(x)).

Thus the estimate H_n(x) of H(·) may be used for estimating the service distribution function G(x):

(4.1) G(x) = 1 − e^{λx} (1 − H_n(x)).

This estimate need not be nondecreasing. A nondecreasing estimate of G(x) is obtained by modifying Equation (4.1):

(4.2) G_n(x) = max[0, max_{j: Z_j ≤ x} {1 − e^{λZ_j} (1 − H_n(Z_j))}].

4.2 Numerical Results

The maximum likelihood estimate of the hidden service distribution function G(x) was obtained for simulated operation of an M/M/∞ system for various arrival rates λ and service rates μ. The results shown in Figure 2 correspond to an arrival rate λ = 1 customer/min and exponential service at rate μ = 0.1 customer/min. Simulation was carried out for five runs (N = 5), each consisting of 50 observations (n = 50). The estimated values G(x) are plotted against the output times T_i for i = 1, 2, ..., n. The empirical distribution function, as well as an exponential distribution function for the service times, is also plotted for the purpose of comparing the simulated results.

FIGURE 2. Maximum likelihood estimate (MLE): N = 5, n = 50; arrival rate λ = 1 person/min; exponential service.

It can be seen that the estimated distribution function is close to the empirical and the actual distribution functions. Brown's estimator was simulated for an M/M/∞ system in steady state with λ = 1 customer/min and exponential service at rate μ = 0.5 customer/min. Results are shown in Figure 3, and it is found that Brown's estimated distribution function is close to the empirical as well as the actual distribution function.
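Equation (4.2) translates directly into code. The sketch below assumes the arrival rate λ is known and uses the right-continuous empirical distribution H_n; the function name and these conventions are ours:

```python
import math

def brown_estimator(z_samples, lam, x):
    # Nondecreasing estimate G_n(x) of Equation (4.2), built from the
    # backward recurrence times Z_1, ..., Z_n and the arrival rate lam.
    z = sorted(z_samples)
    n = len(z)
    best = 0.0                      # the max[0, ...] guard
    for j, zj in enumerate(z):
        if zj > x:                  # only Z_j <= x enter the inner max
            break
        h_n = (j + 1) / n           # empirical H_n(Z_j), right-continuous
        best = max(best, 1.0 - math.exp(lam * zj) * (1.0 - h_n))
    return best
```

By construction the estimate is nondecreasing in x and equals 1 beyond the largest observation.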
Further simulations with different exponential service rates have shown that, while Brown's method gives reasonable results for a system having a service rate close to or larger than the arrival rate, the maximum likelihood estimator is useful for slow service rate systems having large numbers of customers in the system.

FIGURE 3. Brown's estimator: arrival rate λ = 1 person/min; service rate μ = 0.5 person/min.

This contrasting behaviour of the two estimators may be used in order to obtain better results, by using Brown's estimator in the case of fast service and the maximum likelihood estimator in the case of slow service. Simulation of the estimators was also performed for constant service times. The same remarks as above apply to the usefulness of the two estimators relative to the service rate. It was also noted that the maximum likelihood estimator converged to the true unknown service time from above, while Brown's estimator was less biased.

REFERENCES

[1] Boswell, M. T., "Estimation and Testing Trend in a Stochastic Process of Poisson Type," The Annals of Mathematical Statistics 37, 1564-1573 (1966).
[2] Brown, M., "An Estimation Problem in M/G/∞ Queues with Applications to Traffic," Technical Report No. 59, Department of Operations Research, Cornell University, Ithaca, New York (1968).
[3] Gumbel, E. J., Statistics of Extremes (Columbia University Press, New York, 1958).
[4] Marshall, A. W. and F. Proschan, "Maximum Likelihood Estimation for Distributions with Monotone Failure Rate," The Annals of Mathematical Statistics 36, 69-77 (1965).
[5] Mirasol, N. M., "The Output of an M/G/∞ Queuing System is Poisson," Operations Research 11, 282-284 (1963).

THE SINGLE SERVER QUEUE IN DISCRETE TIME-NUMERICAL ANALYSIS III

Marcel F.
Neuts* and Eugene Klimko
Purdue University

ABSTRACT

This paper deals with the stationary analysis of the finite, single server queue in discrete time. The following stationary distributions and other quantities of practical interest are investigated: (1) the joint density of the queue length and the residual service time, (2) the queue length distribution and its mean, (3) the distribution of the residual service time and its mean, (4) the distribution and the expected value of the number of customers lost per unit of time due to saturation of the waiting capacity, (5) the distribution and the mean of the waiting time, (6) the asymptotic distribution of the queue length following departures. The latter distribution is particularly noteworthy, in view of the substantial difference which exists, in general, between the distributions of the queue lengths at arbitrary points of time and those immediately following departures.

1. INTRODUCTION

This paper is a direct sequel to [2], to which we refer for a detailed definition and for the assumptions of the finite, discrete time queue. For easy reference, we give only a summary of the notation here.

NOTATION

L1: Maximum number of customers allowed in the system at any time. All excess customers are lost and do not return.
L2: Maximum duration of the service time of a single customer.
r_j: Probability that a service lasts j units of time, j = 1, ..., L2. We assume without loss of generality that r_{L2} > 0. Also r_1 + ... + r_{L2} = 1.
K: Maximum number of arrivals during a unit of time. It is assumed that K < L1.
p_j: Probability that j customers arrive during a unit of time, j = 0, 1, ..., K. We assume without loss of generality that p_0 > 0 and p_K > 0. Also p_0 + ... + p_K = 1.
X_n: The number of customers in the system at time n+.
Y_n: The number of time units until the customer in service at time n+ completes service. We note that 0 ≤ Y_n ≤ L2 and that Y_n = 0 if and only if X_n = 0.
In [2] it was shown that the bivariate sequence {(X_n, Y_n), n ≥ 0} is an irreducible, aperiodic Markov chain with state space {(0, 0)} ∪ {(1, 2, ..., L1) × (1, ..., L2)}. Its transient behavior was discussed and investigated numerically in [2]. In this paper we first discuss the stationary joint distribution of the queue length X_n and the residual service time Y_n.

*The research of this author was supported by the National Science Foundation, Contract No. GP 28650.

2. THE EQUATIONS FOR THE STATIONARY JOINT PROBABILITIES OF X_n AND Y_n

We denote the stationary probabilities by P(i, j) for i = 1, ..., L1 and j = 1, ..., L2; P(0, 0) is the stationary probability that the queue is empty. The stationary joint density of X_n and Y_n is the unique solution to the following system of linear equations:

(1) a. P(0, 0) = p_0 [P(1, 1) + P(0, 0)],

    b. P(i, j) = Σ_{v=1}^{i} p_{i−v} P(v, j+1) + r_j [ (p_i/(1−p_0)) P(1, 1) + Σ_{v=2}^{i+1} p_{i−v+1} P(v, 1) ],
       for 1 ≤ i ≤ K, 1 ≤ j ≤ L2−1,

    c. P(i, j) = Σ_{v=i−K}^{i} p_{i−v} P(v, j+1) + r_j Σ_{v=i−K+1}^{i+1} p_{i−v+1} P(v, 1),
       for K+1 ≤ i ≤ L1−1, 1 ≤ j ≤ L2−1,

    d. P(L1, j) = P(L1, j+1) + Σ_{v=L1−K}^{L1−1} (1 − Σ_{k=0}^{L1−v−1} p_k) P(v, j+1)
                + r_j Σ_{v=L1−K+1}^{L1} (1 − Σ_{k=0}^{L1−v} p_k) P(v, 1),
       for 1 ≤ j ≤ L2−1,

    e. P(i, L2) = r_{L2} [ (p_i/(1−p_0)) P(1, 1) + Σ_{v=2}^{i+1} p_{i−v+1} P(v, 1) ], for 1 ≤ i ≤ K,

    f. P(i, L2) = r_{L2} Σ_{v=i−K+1}^{i+1} p_{i−v+1} P(v, 1), for K+1 ≤ i ≤ L1−1,

    g. P(L1, L2) = r_{L2} Σ_{v=L1−K+1}^{L1} (1 − Σ_{k=0}^{L1−v} p_k) P(v, 1),

    h. P(0, 0) + Σ_{i=1}^{L1} Σ_{j=1}^{L2} P(i, j) = 1.

The system (1) contains L1 L2 + 1 independent linear equations in L1 L2 + 1 unknowns. We shall show that its solution may be conveniently expressed in terms of the solution of a homogeneous system of L1 equations in L1 unknowns. Moreover, the latter system has a particular structure which greatly simplifies its numerical solution.

We denote by P_j the L1-tuple [P(1, j), ..., P(L1, j)] for j = 1, ..., L2. We also introduce the L1 × L1 stochastic matrices A and B. The matrix A accounts for the arrivals during a unit of time in which no service is completed: its entries are A_{vi} = p_{i−v} for v ≤ i ≤ L1−1, the last column A_{v,L1} is chosen so that each row sums to one (all excess arrivals being lost), and A_{vi} = 0 for i < v, so that A is upper triangular. The matrix B accounts for the transition at a service completion: its first row is B_{1i} = p_i/(1−p_0) for 1 ≤ i ≤ L1−1, its remaining rows are B_{vi} = p_{i−v+1} for v−1 ≤ i ≤ L1−1, 2 ≤ v ≤ L1, and its last column is again chosen so that each row sums to one; all other entries vanish, so that the only entry of B below the diagonal is the subdiagonal entry B_{v,v−1} = p_0.
In terms of A and B, the equations (1b)-(1g) may be written as

(2) P_j = P_{j+1} A + r_j P_1 B, 1 ≤ j ≤ L2−1,
    P_{L2} = r_{L2} P_1 B.

The latter system is equivalent to the equations

(3) P_j = r_{L2}^{−1} P_{L2} Σ_{v=j}^{L2} r_v A^{v−j}, 1 ≤ j ≤ L2−1.

We now observe that both A and B are stochastic matrices, that A is upper triangular, and that the matrix B has only one subdiagonal. We shall say, for brevity, that B is nearly upper triangular. Since r_1 + ... + r_{L2} = 1 and A is an upper triangular stochastic matrix, the matrix Σ_{v=1}^{L2} r_v A^{v−1} is stochastic and upper triangular. The stochastic matrix B is irreducible, so that the matrix

(4) Q = Σ_{v=1}^{L2} r_v A^{v−1} B

is irreducible and stochastic. Finally, it is easy to verify that Q is nearly upper triangular. The vector P_{L2} satisfies P_{L2} Q = P_{L2}, and is therefore proportional to the vector of the stationary probabilities of the matrix Q. The nearly upper triangular form of the matrix Q makes the numerical computation of the vector P_{L2}, up to a positive multiplicative constant, particularly simple. The vector P_{L2} is proportional to the vector (t_1, t_2, ..., t_{L1}), whose components may be computed recursively as follows:

(5) t_1 = 1,
    t_2 = (1 − q_{11}) q_{21}^{−1},
    t_k = q_{k,k−1}^{−1} [ t_{k−1} (1 − q_{k−1,k−1}) − Σ_{v=1}^{k−2} t_v q_{v,k−1} ], 3 ≤ k ≤ L1.

It is easy to verify that none of the entries q_{k,k−1}, 2 ≤ k ≤ L1, vanishes, so that, by using the first equation in (2), the vectors P_j, j = 1, ..., L2−1, may be computed up to a common positive multiplicative constant. Equation (1a) is then used to determine P(0, 0) up to the same multiplicative constant. This constant may finally be computed using Equation (1h). The stationary joint density of the queue length and the residual service time is therefore determined.
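The recursion (5) is straightforward to implement. The sketch below is our code, not the authors' subroutine; it computes the normalized stationary vector of a small, illustrative nearly upper triangular stochastic matrix:

```python
def stationary_nearly_upper_triangular(q):
    # Recursion (5): stationary probabilities of an irreducible stochastic
    # matrix q whose only entries below the diagonal lie on the subdiagonal.
    n = len(q)
    t = [1.0]                                   # t_1 = 1
    if n > 1:
        t.append((1.0 - q[0][0]) / q[1][0])     # t_2 = (1 - q_11) / q_21
    for k in range(2, n):                       # t_k for 3 <= k <= n
        s = sum(t[v] * q[v][k - 1] for v in range(k - 1))
        t.append((t[k - 1] * (1.0 - q[k - 1][k - 1]) - s) / q[k][k - 1])
    total = sum(t)
    return [x / total for x in t]
```

Each t_k is obtained from the balance equation of column k−1, which involves no entry below the subdiagonal; this is why the nearly upper triangular form makes the computation so cheap.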
3. THE STATIONARY DENSITY OF THE WAITING TIME

The support of the stationary density {w_j} of the waiting time consists of the integers 0, 1, ..., L1 L2. Clearly w_0 = P(0, 0), and for 1 ≤ j ≤ L1 L2 the density may be written symbolically as the convolution polynomial

(6) {w_j} = P(1, ·) + P(2, ·) * {r_v} + P(3, ·) * {r_v}^{(2)} + ... + P(L1, ·) * {r_v}^{(L1−1)},

where {r_v} is the density of the service time and {r_v}^{(k)} denotes its k-fold convolution. The numerical computation of the w_j, 1 ≤ j ≤ L1 L2, by using a convolution analogue of Horner's algorithm for polynomials, was discussed in [2].

4. THE STATIONARY DENSITY OF THE NUMBER OF LOST CUSTOMERS PER UNIT OF TIME

Since the waiting room is finite, it is possible that customers will be lost because the waiting room is full at their arrival time. It is therefore of interest to know the stationary density {φ_j} of the number of lost customers per unit of time. It has its support on the integers 0, 1, ..., K and may be determined by the explicit expressions

(7) φ_j = Σ_{k=j}^{K} p_k Σ_{ρ=1}^{L2} P(L1 − k + j, ρ), 1 ≤ j ≤ K,
    φ_0 = 1 − Σ_{j=1}^{K} φ_j.

Knowing the joint density discussed in section 2, the probabilities {φ_j} are readily computed.

5. THE STATIONARY DENSITY OF THE QUEUE LENGTH AT DEPARTURES

The probabilities associated with the queue length at departure times are primarily of interest in the analytic treatment of queues of M|G|1 type. Although they are frequently examined, their inherent applied interest is limited. As we shall indicate below, the density of the queue length following departures may easily be obtained from auxiliary quantities which are computed in the process of evaluating the joint stationary density discussed in section 2. In view of the importance ascribed to this density in the applied queueing literature, we decided to investigate its computational aspects. Note the very substantial difference which may exist between it and the stationary density of X_n. The queue lengths following departures form an irreducible, aperiodic Markov chain with state space {0, 1, ..., L1−1}.
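Before turning to the transition matrix of this chain, we sketch the Horner-type evaluation of the convolution polynomial (6) mentioned above. The indexing conventions (list index = number of time units, with a leading zero entry for the service density) and the data in the test are ours:

```python
def convolve(a, b):
    # Discrete convolution of two densities indexed from 0.
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def waiting_time_density(P, r, p00):
    # Equation (6) by a Horner-type scheme: start with P(L1, .) and
    # alternately convolve with the service density {r_v} and add P(i, .).
    # P[i] = [P(i+1, 1), ..., P(i+1, L2)]; r = [r_1, ..., r_L2].
    rr = [0.0] + list(r)            # index = service duration
    w = [0.0] + list(P[-1])         # index = residual service time
    for i in range(len(P) - 2, -1, -1):
        w = convolve(w, rr)
        for j, pj in enumerate([0.0] + list(P[i])):
            w[j] += pj
    w[0] = p00                      # waiting time 0: empty system
    return w
```

The result has support 0, 1, ..., L1 L2, as stated above.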
Let us denote its transition probability matrix by T. Furthermore, let θ_k(i, ν) be the probability that, in k consecutive units of time during which no departures occur, ν customers join the queue, given that the queue length at the beginning of the first unit of time was i. The entries of T are then given by

(8) T_{0j} = Σ_{k=1}^{L2} r_k Σ_{h=1}^{K} p_h (1 − p_0)^{−1} θ_k(h, j − h + 1), for 0 ≤ j ≤ L1−1,
    T_{ij} = Σ_{k=1}^{L2} r_k θ_k(i, j − i + 1), for 1 ≤ i ≤ j + 1,
    T_{ij} = 0, for i > j + 1.

We note that the transition probability matrix T is nearly upper triangular. The stationary probabilities corresponding to T may be calculated by a simple recursion such as in Formula (5). In order to evaluate the entries of the matrix T, we first show that

(9) θ_k(i, j − i + 1) = (A^k)_{i, j+1}, for 1 ≤ i ≤ L1, 0 ≤ j ≤ L1−1,

where A is the upper triangular matrix defined in section 2. For k = 1, we find that

(10) θ_1(i, j − i + 1) = p_{j−i+1}, for 0 ≤ j − i + 1 ≤ K, j ≤ L1−2,
     θ_1(i, j − i + 1) = Σ_{v=L1−i}^{K} p_v, for L1 − K ≤ i ≤ L1, j = L1−1,
     θ_1(i, j − i + 1) = 0, for all other pairs (i, j),

so that Equation (9) holds for k = 1. Furthermore,

(11) θ_{k+1}(i, j − i + 1) = Σ_{v=max(0, j−K)}^{j} θ_k(i, v − i + 1) p_{j−v}, for 0 ≤ j ≤ L1−2,

and

     θ_{k+1}(i, L1 − i) = Σ_{v=L1−K}^{L1} θ_k(i, v − i) Σ_{h=L1−v}^{K} p_h, for 1 ≤ i ≤ L1.

When expressed in terms of the matrix A, Formula (11) proves (9) inductively. The matrix T can be compactly written as

(12) T = C Σ_{k=1}^{L2} r_k A^k,

where C_{1j} = p_j (1 − p_0)^{−1} for 1 ≤ j ≤ K, C_{i,i−1} = 1 for 2 ≤ i ≤ L1, and C_{ij} = 0 for all other pairs (i, j), with the convention that row i and column j of the matrices in (12) correspond to the queue lengths i − 1 and j − 1.

The relation between the limiting distribution of the queue length following the nth departure and the stationary queue length distribution is noteworthy. A well-known theorem from Reference [3] states that, in a stable M|G|1 queue with single arrivals, the queue length at time t and the queue length following the nth departure have the same limiting distribution as t and n, respectively, tend to infinity.
An analogous result holds here, provided that the probability of group arrivals is zero; this result, discussed by Dafermos and Neuts [1], is proved by an exact analogue of the argument given there, and we shall omit the proof. In the case of group arrivals (K > 1) a difference exists between those two limiting distributions. Theorems which relate them may be obtained using the theory of Markov renewal processes; we do not pursue this topic here, but we offer as an illustration some numerical results for a queue which has rare arrivals of large groups of customers. We considered a queue with p_0 = 0.975, with the remaining arrival probability 0.025 concentrated on a large group size, and r_1 = r_2 = 0.5. The limiting distribution of the queue length following a departure has a mean equal to 32.2864. In contrast, the limiting distribution of the queue length at time n has a mean equal to 24.1752. In addition, we list a summary of the two stationary distributions, in which π_k* denotes the stationary probability of at most k customers following a departure from the system.

The greater limiting probability of large queue lengths following departures may appear paradoxical at a casual reading. A moment's reflection shows, however, that on the contrary this is to be anticipated in stable queues with group arrivals. In our example the queue length will typically be zero for long intervals of time, because arrivals are rare. The averaging procedure involved in the stationary distribution of the queue length at time n heavily favors the lower values of k. The limiting distribution of the queue length following the nth departure effectively ignores the long idle periods and results primarily from the behavior during the service of the large groups of customers. The high probabilities of large k in this distribution are therefore not surprising. This example strikingly shows that a stationary distribution of the queue features may be of limited practical value, even in very stable queues: most realizations of the queue length process in our example will exhibit very substantial fluctuations which are not reflected in the asymptotic
distributions. The practical questions related to queues of this type can only be answered after analyzing their transient behavior. The exclusive concern with asymptotic results in "practical" discussions of queueing theory is therefore regrettable.

6. COMPUTATIONAL ORGANIZATION

In order to minimize both the computation time and the required memory storage, we took advantage of the highly structured form of the matrices Q and T in Equations (4) and (12), respectively. The basic matrix is the upper triangular polynomial matrix Q* = {q*_{ij}},

(13) Q* = Σ_{v=1}^{L2} r_v A^{v−1}.

The rows of this matrix are similar in the sense that

(14) q*_{i,i+v} = q*_{2,v+2}, for v = 0, 1, 2, ..., L1 − i − 1; i = 2, 3, ..., L1−1.

Furthermore, the matrix Q* is stochastic, so that

(15) q*_{i,L1} = 1 − Σ_{j=i}^{L1−1} q*_{ij}.

Therefore, the first row determines the entire matrix. This permits the storage of Q* using only L1 memory spaces, rather than the (L1² + L1)/2 spaces required for an arbitrary upper triangular matrix. The resulting saving in memory space is substantial for large queues and in fact makes the analysis of queue lengths up to 800 feasible.

Computation of the matrix Q* is performed by using Horner's method for the evaluation of polynomials, i.e., by recursive computation as follows:

(16) Q*_1 = r_{L2} A + r_{L2−1} I,
     Q*_n = Q*_{n−1} A + r_{L2−n} I, n = 2, ..., L2−1.

Each of the successive matrices Q*_n is completely determined by its top row. The right-most elements are not needed and therefore are not computed. The top row entries of Q*_{n+1} are rapidly calculated by means of the formulas
(17) q*_{1j}^{(n+1)} = Σ_{i=0}^{min(K, j−1)} p_i q*_{1,j−i}^{(n)} + r_{L2−n−1} δ_{j1}, for j ≤ K,
     q*_{1j}^{(n+1)} = Σ_{i=0}^{K} p_i q*_{1,j−i}^{(n)}, for K < j ≤ L1.

The matrix Q = Q* B is nearly upper triangular, and its third through last rows, except for the last two columns, are essentially repetitions of the second row, each shifted one position to the right. The last column is determined by the condition that the rows sum to one. We therefore need to compute and store only the first and second rows and the (L1−1)st column. This requires 3 L1 − 4 memory cells for the storage of the Q matrix, rather than the L1 + (L1 + 2)(L1 − 1)/2 required for an arbitrary nearly upper triangular matrix. The top row elements of Q are given by

(18) q_{1j} = (p_j/(1−p_0)) q*_{11} + Σ_{i=0}^{min(K, j−1)} p_i q*_{1,j−i+1}, for j ≤ K,
     q_{1j} = Σ_{i=0}^{K} p_i q*_{1,j−i+1}, for K < j < L1−1,
     q_{1,L1−1} = p_0 (1 − Σ_{j=1}^{L1−1} q*_{1j}) + Σ_{i=1}^{K} p_i q*_{1,L1−i};

the second row elements of the Q matrix are

(19) q_{2j} = Σ_{i=0}^{min(K, j−1)} p_i q*_{2,j−i+1}, for 1 ≤ j ≤ L1−2;

and the (L1−1)st column elements are calculated by using

(20) q_{i,L1−1} = p_0 (1 − Σ_{j=i}^{L1−1} q*_{ij}) + Σ_{k=1}^{min(K, L1−i)} p_k q*_{i,L1−k}, for 2 ≤ i ≤ L1.

The stationary probabilities of the Q matrix were determined using Formula (5) and its compact representation by Formulas (18)-(20). For this purpose, a subroutine called STAPROB was written. The resulting stationary probability vector was identified, temporarily, with the vector P_{L2}. The vectors P_{L2−1}, ..., P_1 were then successively obtained from the first equation in (2); for this computation essentially only the top row of the matrix A is needed. Finally, P(0, 0) is computed from Equation (1a), and the normalization condition (1h) determines the multiplicative constant. The waiting-time distribution is computed by a subroutine called WAIT, adapted from the algorithm of [2]. In cases where L1 L2 is large, one may wish to print only the percentage points of the waiting-time distribution; a routine to do this was also written.
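As a concrete check of the Horner recursion (16), the sketch below computes Q* with full matrices rather than the single stored row; the 3 × 3 matrix A and the service density r in the test are illustrative, not taken from the paper:

```python
def mat_mult(X, Y):
    # Dense matrix product of two square matrices given as lists of rows.
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def q_star(A, r):
    # Q* = sum_{v=1}^{L2} r_v A^(v-1), accumulated by Horner's rule:
    # start with r_L2 I, then repeatedly multiply by A and add r_v I.
    n = len(A)
    S = [[r[-1] if i == j else 0.0 for j in range(n)] for i in range(n)]
    for v in range(len(r) - 2, -1, -1):
        S = mat_mult(S, A)
        for i in range(n):
            S[i][i] += r[v]
    return S
```

Since A is upper triangular and stochastic and the r_v sum to one, the result is again upper triangular and stochastic, as asserted in the text.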
The computational procedure for the queue length distribution following departures is similar to that for the stationary queue length distribution. The matrix Σ_{k=1}^{L2} r_k A^k is first computed, and the matrix T is then assembled in a manner similar to that of the matrix Q; only a modicum of additional computation is involved. The stationary distribution is then calculated by the subroutine STAPROB.

Testing

In addition to testing the individual programs, we compared the stationary probabilities with the transient probabilities after 60 units of time; the latter were obtained by the methods developed in [2].

Computational Experience

Practical limits on the problem size are set by the memory requirements. The available memory space of 150K octal words permits, for instance, queue lengths of size 800 with service time distributions on 25 points. For problems of this magnitude the computation time was a limiting factor, particularly in the computation of the waiting-time distribution. We ran examples both with and without the computation of the waiting time. The central processing times on the CDC 6500 at Purdue University are shown in Table 2. T1 and T2 are the actual program running times in seconds (without compilation and loading times), respectively with and without the computation of the waiting-time distribution. For the example with L1 = 800, L2 = 25, K = 4, the time T1 was in excess of 3,000 sec, and the computations were not completed even then. In all the examples, we used the same arrival distribution: p_0 = 0.8, p_1 = p_2 = p_3 = p_4 = 0.05. The service time distribution for the first three examples was concentrated on five points, with r_5 = 0.175. In the last example, the service time distribution was geometric with p = 0.5, truncated at L2 = 25, with the residual probability added to the last term.

Table 2

 L1    L2   K        T1        T2
 100    5   4       5.751     0.945
 200    5   4      22.539     2.221
 400    5   4      69.774     6.612
 800   25   4   > 3,071.032  26.290

7. CONCLUSIONS

Large discrete, single server queues in the stationary phase may be analyzed numerically. As we have shown, most queue features of interest, with the possible exception of the stationary waiting-time distribution, can be computed without the use of excessive processing times.
This should be contrasted with simulation methods, which are inherently ill-suited for the study of the stationary phase. The prohibitive processing times required for the waiting-time distribution in large queues raise the interesting question of how to evolve efficient numerical procedures for the evaluation of expressions of the general type which appear frequently in stochastic models of varied applied interest. Finally, the example discussed in section 5 shows that in queues exhibiting large fluctuations it may be hazardous to base conclusions on a single stationary distribution. In such cases one should study the transient behavior whenever possible.

For further information on the algorithms discussed in this paper, one may contact either of the authors at the Department of Statistics, Purdue University, West Lafayette, Ind. 47907.

REFERENCES

[1] Dafermos, S. and M. F. Neuts, "A Single Server Queue in Discrete Time," Cahiers du Centre de Recherche Operationnelle 13, 23-40 (1971).
[2] Neuts, M. F., "The Single Server Queue in Discrete Time-Numerical Analysis I," Naval Research Logistics Quarterly 20, 297-304 (1973).
[3] Takacs, L., Introduction to the Theory of Queues (Oxford University Press, New York, 1962).

SOME EXPERIMENTS IN GLOBAL OPTIMIZATION

James K. Hartman
Naval Postgraduate School
Monterey, California

ABSTRACT

When applied to a problem which has more than one local optimal solution, most nonlinear programming algorithms will terminate with the first local solution found. Several methods have been suggested for extending the search to find the global optimum of such a nonlinear program. In this report we present the results of some numerical experiments designed to compare the performance of various strategies for finding the global solution.

I.
INTRODUCTION

It is frequently the case in applied optimization studies that an algorithm which is known to converge to a global optimal solution under certain conditions (such as convexity) will be applied to a problem which does not satisfy these conditions. In particular, optimization problems which are suspected of having several local optima in addition to the global optimum are often solved using algorithms which will stop and indicate a solution whenever any local optimum is reached. In such cases a useful strategy is to repeat the solution process several times, starting from different initial points, to avoid accepting a solution which is only a local optimum. This is probably the most frequently suggested strategy for avoiding local solutions. There are also other strategies for avoiding the local solutions in favor of the global optimum. This paper describes some numerical experiments which were done to compare the performance of several strategies for organizing such a global optimization.

II. THE PROBLEM

In order to develop and test strategies for avoiding local solutions, it is necessary to specify a class of optimization problems to be considered. This paper will concentrate on the "essentially unconstrained" nonlinear programming problem

(1) minimize f(x) subject to x ∈ S ⊂ E^n,

where the local and global optimal solutions to (1) are known to occur in the interior of the set S. In such a problem the feasible region S determines a domain to be searched for solutions, but the boundaries of S do not determine the solutions. In this sense problem (1) can be considered "essentially unconstrained." The simplest way to specify the set S is to place upper and lower bounds on each variable. Since each of the strategies to be considered will involve random selections of x, it is necessary to confine the search to a bounded region S.
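The restart strategy just described is easily sketched in code. In the sketch below (ours throughout), the local minimizer is a crude derivative-free coordinate descent with a shrinking step, standing in for the efficient descent methods discussed later; the double-well objective in the test is illustrative:

```python
import random

def local_minimize(f, x, bounds, step=0.5, tol=1e-6):
    # Crude descent: try +/- step moves in each coordinate, halving the
    # step whenever no move improves, until the step is below tol.
    x, fx = list(x), f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                y = list(x)
                y[i] = min(max(y[i] + d, bounds[i][0]), bounds[i][1])
                fy = f(y)
                if fy < fx:
                    x, fx, improved = y, fy, True
        if not improved:
            step /= 2.0
    return x, fx

def multistart(f, bounds, restarts, rng):
    # Repeat a local search from random starting points in the box S,
    # retaining the best local solution found to date.
    best_x, best_f = None, float("inf")
    for _ in range(restarts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x, fx = local_minimize(f, x0, bounds)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f
```

With enough restarts, the chance that every starting point falls in the basin of a nonglobal minimum becomes small, which is exactly the intuition behind the strategy.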
In addition, search strategies S5 and S6 will partition S into smaller regions; these two strategies can only be conveniently described for S determined by upper and lower bounds on the variables.

"Essentially unconstrained" problems arise frequently as the "unconstrained" subproblems in interior penalty function algorithms such as the Sequential Unconstrained Minimization Technique (SUMT) of Fiacco and McCormick [3]. In the SUMT method, if the original nonlinear program is not a convex program, then the subproblem (1) may have local solutions which are distinct from the global solution.

For problems like (1), a local optimal solution can be obtained by applying any of the efficient unconstrained descent algorithms (such as the Davidon-Fletcher-Powell method) to minimize the function f(x) while being careful not to penetrate the boundary of S. We shall now consider several strategies which try to ensure that the local solution we finally accept is, in fact, a global minimum.

III. STRATEGIES FOR AVOIDING LOCAL SOLUTIONS

Six different strategies for organizing a global optimization are compared in this paper. These are briefly described below, with references to more complete descriptions when they exist.

STRATEGY S1 (from the folklore):
a. Set k = 1.
b. Let x^k be a vector chosen at random in the search region S. Starting at x^k, perform an unconstrained minimization search on the function f(x), terminating at the local minimum x^k*.
c. Replace k with k + 1 and go to step b.
At each stage retain the best local solution obtained to date.

S1 is the strategy suggested in section I. Intuitively, the problem with this strategy is that it may repeatedly search to the same local minimum if the starting points x^k happen to be chosen within the "range of attraction" of that local minimum. The next three strategies attempt to solve this problem.

STRATEGY S2:
a. Set k = 1. Let f* be the objective function value for the best local solution so far obtained.
Initially f* = +∞.
b. Randomly select points x ∈ S until one is found with f(x) < f*. Call this point x^k.
c. Starting at x^k, perform an unconstrained minimization search terminating at a new local minimum x^k*.
d. Set f* = f(x^k*), replace k with k + 1, and go to step b.

In S2 a minimization (step c) is initiated at x^k only if f(x^k) is smaller than the best solution f* found to date. Hence each successive minimization gives a new local minimum which is better than any found so far. The same local minimum cannot be located twice. It is, however, much more difficult to determine the starting points x^k for strategy S2 than for S1.

STRATEGY S3 (Bocharov [1]):
a. Choose x^1 randomly in S. Set k = 1.
b. Starting from x^k, perform an unconstrained minimization terminating at the local minimum x^k*.
c. Choose a direction vector d^k ∈ E^n at random and consider f(x^k* + α d^k) as the positive scalar α increases. Moving away from x^k* in direction d^k, the function f must initially increase (since x^k* is a local minimum). Continue to increase α until f begins to decrease, at α = α_k.
d. Let x^{k+1} = x^k* + α_k d^k, replace k with k + 1, and go to step b.

STRATEGY S4 (Bocharov [1]):
S4 is the same as S3 except that, in step c, instead of choosing the direction at random, d^k is chosen to be the direction of overall progress from the most recent minimization,

(2) d^k = x^k* − x^k.

Both S3 and S4 attempt to prevent repeated minimization to the same local optimum by moving out of the region of attraction of the most recent local solution before starting the next minimization. By continuing in the direction (2), strategy S4 hopes to also avoid local minima detected before the most recent minimum.

Strategies S5 and S6 are considerably different from the first four methods.
While S1-S4 attempt to choose good starting points for repeated local minimizations, S5 and S6 attempt to gain information about the entire search region S, gradually concentrating their attention on portions of S which are, in some sense, "likely" to contain the global minimum. S5 and S6 are most easily described for problems where S is determined by lower and upper bounds on each variable:

S = {x ∈ E^n | l_i ≤ x_i ≤ L_i, i = 1, ..., n}.

For ease of presentation we will restrict our attention to such problems.

STRATEGY S5 (Piecewise Coordinate Projection, Zakharov [5]):
a. Set up an initially empty list of points, and let S' = {x ∈ E^n | l_i ≤ x_i ≤ L_i, i = 1, ..., n} be the "remaining feasible region." Let S' = S initially.
b. Randomly choose N points x^k ∈ S', compute f(x^k) for each, and adjoin them to the list.
c. For each coordinate x_i of x (i = 1, ..., n), separate the remaining feasible interval [l_i, L_i] into m equal subintervals. Let

X_ij = {x^k in the list whose ith component is in the jth subinterval of [l_i, L_i]}
     = {x^k | (j−1)(L_i − l_i)/m ≤ x_i^k − l_i < j(L_i − l_i)/m}

for i = 1, ..., n and j = 1, ..., m. Then X_i1, X_i2, ..., X_im describe the projection of the list of points x^k onto the m subintervals of the ith coordinate axis.
d. By considering {f(x^k) | x^k ∈ X_ij} (i = 1, ..., n; j = 1, ..., m), select the subinterval set X_st which is considered most likely to contain the global minimum. Briefly, this is done by selecting the subinterval set for which the average functional value is smallest, being careful to avoid choices based on insufficient information (for more details see Zakharov [5]).
e. By redefining l_s and L_s, delete the subinterval sets X_sj (j = 1, ..., m; j ≠ t) from the remaining feasible region. Delete each point x^k in the list whose sth coordinate is in a deleted subinterval X_sj. Go to step b.

As the remaining feasible region S' gradually shrinks, the global minimum will be more and more closely bracketed.
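A simplified, hypothetical rendering of the shrink-and-resample loop of S5 follows. It keeps only the current round's sample, always keeps the subinterval with the smallest average (omitting the insufficient-information safeguard of step d), and uses illustrative parameter values:

```python
import random

def s5_search(f, bounds, n_points=150, m=3, rounds=3, rng=None):
    # Each round: sample in the remaining box, then shrink every
    # coordinate to the subinterval whose points have the lowest mean f.
    rng = rng or random.Random(0)
    lo = [b[0] for b in bounds]
    hi = [b[1] for b in bounds]
    n = len(bounds)
    for _ in range(rounds):
        pts = [[rng.uniform(lo[i], hi[i]) for i in range(n)]
               for _ in range(n_points)]
        vals = [f(p) for p in pts]
        for i in range(n):
            width = (hi[i] - lo[i]) / m
            best_j, best_avg = 0, float("inf")
            for j in range(m):
                cell = [v for p, v in zip(pts, vals)
                        if lo[i] + j * width <= p[i] < lo[i] + (j + 1) * width]
                if not cell:
                    continue
                avg = sum(cell) / len(cell)
                if avg < best_avg:
                    best_j, best_avg = j, avg
            lo[i], hi[i] = lo[i] + best_j * width, lo[i] + (best_j + 1) * width
    return [(lo[i] + hi[i]) / 2.0 for i in range(n)]
```

As the text warns, a deleted subinterval can never be recovered, so a wrong choice in an early round loses the global minimum for good; the safeguards of step d exist precisely to make that unlikely.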
The problem with this method is that the most promising subinterval must be determined on the basis of the sample of points x^k chosen so far. There is always a chance that a subinterval chosen for deletion will, in fact, contain the global minimum solution, and once it is deleted it can never be recovered. Strategy S6 attempts to solve this problem by retaining the entire region S throughout, and using a probabilistic allocation device to concentrate attention on areas of S which are most promising. This algorithm is new and is still under development. Initial results show some promise, but considerable improvement is still necessary.

STRATEGY S6 (Coordinatewise Allocation):
a. Define a marginal probability distribution function Φ_i on the feasible interval [l_i, L_i] of each coordinate axis i = 1, ..., n. In the absence of other information, a uniform distribution seems reasonable for the initial distribution.
b. Randomly choose N points x^k ∈ S and compute f(x^k) for each. The probability distribution functions Φ_i govern these choices in that the ith component x_i^k of x^k is chosen as a random sample point from the distribution Φ_i. Thus the Φ_i determine the allocation of trial points to the various regions of S.
c. Based on the results of the trials to date, modify the Φ_i to increase the allocation of future points to regions considered likely to contain the global minimum. Go to step b.

Strategy S6 can have many realizations, depending on the method of handling step c. In the version of S6 reported in this paper, step c is performed as follows for each coordinate i = 1, ..., n:
1. The feasible interval [l_i, L_i] is split into m subintervals.
2. A "success" is defined as a value of f(x^k) in the bottom 25 percent of all f(x^k) values, and the ratios r_ij of the number of successes in subinterval j of coordinate i to the total number of points in subinterval j are computed for all i and j.
3.
3. The modified probability for subinterval j of coordinate i is given by p_ij = r_ij / (sum over j of r_ij), the normalized success ratio.

Several improvements on this allocation scheme are being considered for future testing.

In early tests it became apparent that performance of the various strategies fluctuated considerably, depending on the particular test problem under investigation. For example, relative to the other strategies, S2 performed spectacularly on some problems but miserably on others. On closer examination it was found that S2 did well on problems for which the global f value was significantly lower than the local minima and for which the global region of attraction was quite large; that is, on problems which were rather easy to solve. This suggests the need for a benchmark strategy to be used for assessing problem difficulty. The benchmark strategy should have as little structure as possible. We have chosen to use the pure random search method for this purpose.

STRATEGY S0 (Pure Random — Brooks [2]):

a. Set k = 1.

b. Randomly select x^k in S. Evaluate f(x^k).

c. Replace k with k + 1. Go to step b.

At each stage retain the best f value found to date. This strategy may be regarded as a benchmark method since it makes no attempt to take advantage of the information gathered at previous stages. In this sense it is probably the most primitive strategy possible. We can use S0 in two ways:

1. If a strategy does not do considerably better than S0, it should be discarded.

2. If a test problem is such that S0 can solve it nearly as well as the other strategies, then the problem is not very difficult and probably is not useful for discriminating among strategies.

IV. COMPUTATIONAL EXPERIMENTS

A number of computational experiments were performed to compare the various strategies presented above. For each of the test functions employed, each strategy was run 30 times with different
random number sequences. A run was allowed to continue until the algorithm had required 1,000 evaluations of the objective function f(x).

Test problems with predictable local and global solutions were constructed using the objective function

f(x) = sum from j=1 to m of c_j exp[(x - p_j)' A_j (x - p_j)].

This function consists of the superposition of m modes, where mode j has depth c_j in E^1, position p_j in E^n, and shape and width determined by the n x n negative definite matrix A_j. Particular test functions were obtained by choosing the parameters c_j and p_j from a random number table. A_j was chosen to ensure that the m modes were narrow enough that they did not completely merge into one another.

Strategies S1 through S4 require an unconstrained minimizer. Since the purpose of the study is to compare global strategies, a minimizer is desired which uses the same information as is available to the other strategies — function values, but not derivatives. Powell's derivative-free method was selected [4].

V. RESULTS

The computational results obtained are summarized in Tables 1 and 2. Table 1 gives characteristics of the test problems used. Table 2 lists for each problem and for each strategy the best f value obtained after 200, 500, and 1,000 function evaluations. Each value is the average of the 30 trials conducted for that problem and strategy. The percentage of the 30 trials which did not locate the global minimum after 1,000 function evaluations is also given in Table 2. It is difficult to obtain a single measure of performance for this kind of problem since we must balance speed of convergence against the chance that the global solution will be missed entirely.

Table 1.
Characteristics of Test Problems

  Problem   Number of variables   Number of minima   Value of global minimum
    A               2                     4                   -9.0
    B               2                    10                   -9.9
    C               2                    10                   -9.3
    D               2                    10                   -9.8
    E               2                    10                  -13.0
    F               5                     5                   -9.4
    G               5                     5                  -10.1
    H               5                    10                  -10.0
    I               5                    10                   -8.9
    J               5                    20                  -11.9

Table 2. Test Results

[The body of Table 2, giving for each problem and strategy the average best f value after 200, 500, and 1,000 function evaluations and the percentage of trials missing the global minimum, is too badly garbled in this copy to reproduce.]

Some general conclusions:

1. [...] very challenging, since S0 did nearly as well as most other [...]

2. [...] but frequently stops short of the global solution [...]

3. In general, S1, S3, and S4 perform about the same and better than the other strategies.

4. S5 and S6 exhibit slow initial convergence. Both frequently tend to concentrate the search effort around a good local minimum which is not global.

5. On difficult problems even the best of these methods will frequently fail to locate the global minimum.
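The test-problem construction of Section IV, a superposition of m negative-definite modes, can be sketched as follows. The helper name and all parameter values are illustrative, not the values drawn from the paper's random number table, and for brevity each A_j is taken to be a negative multiple of the identity.

```python
import math

# A test function in the form described in Section IV:
#   f(x) = sum_j c_j * exp[(x - p_j)' A_j (x - p_j)],
# with each A_j negative definite, so mode j is a pit of depth roughly
# c_j at position p_j.  Here A_j = -a_j * I for simplicity (an assumption).

def make_test_function(modes):
    """modes: list of (depth c_j < 0, position p_j, width scale a_j > 0)."""
    def f(x):
        total = 0.0
        for c, p, a in modes:
            q = -a * sum((xi - pi) ** 2 for xi, pi in zip(x, p))
            total += c * math.exp(q)          # c_j exp[(x-p_j)'A_j(x-p_j)]
        return total
    return f

# two pits in the plane; the deeper one near (1, 1) is the global minimum
f = make_test_function([(-9.0, (1.0, 1.0), 8.0),
                        (-6.0, (-1.0, -0.5), 8.0)])
```

Keeping the width scales a_j large relative to the mode spacing is what prevents the pits from merging, as the text requires.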
It is also interesting to examine the entire graph of the number of function evaluations versus the best function value obtained for each strategy. These curves are shown for test function H in Figure 1. The results for function H are representative of those obtained for the other functions and serve to emphasize conclusions 2, 3, and 4, above.

Note on this graph that S5 and S6 display a consistent decrease at an initial rate which is similar to that of the better strategies S3 and S4. However, since they start much higher on the graph, S5 and S6 never catch up. This is inherent in the methods. Given any starting point x^k, S3 and S4 immediately search to a local minimum, and thus quickly get a fairly low objective function value. Starting from the same initial point, S5 and S6 merely note the objective value and proceed to check other points, doing a global survey instead of a local minimization. Thus, in the initial stages, S5 and S6 are essentially identical to S0, pure random search. It is only after considerable information has been accumulated that these methods can concentrate their attention on promising search areas.

FIGURE 1. Performance of strategies on test function H (average of 30 trials for each strategy).

A comparison of S0, S1, S2, S3, and S4 is also interesting. In general, it seems that in those strategies which alternate cycles of random searching with unconstrained minimizations, the best results are obtained by the methods which do the least random searching. Thus, S0 is purely random search, and its performance is the worst.
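For reference, the S0 benchmark just mentioned is the simplest of the strategies to sketch. The function name, box bounds, and evaluation budget below are illustrative choices.

```python
import random

# Sketch of benchmark strategy S0 (pure random search, after Brooks [2]):
# sample uniformly in the box and retain the best f value found to date.

def s0(f, lower, upper, budget=1000, seed=0):
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(budget):
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx    # retain the best value to date
    return best_x, best_f
```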
S2 requires several (perhaps many) random evaluations before each minimization to find a point better than the current best local solution, and its performance is second worst. Strategy S1 selects one random point x^k before each minimization, while S3 selects one random direction d^k. Their performance is similar and almost as good as that of S4, which makes no random selections between minimizations. This strongly suggests that an improved strategy will consist of frequent minimizations coupled with an improved way of selecting starting points which are promising and which also sample the entire region.

In conclusion, it is appropriate to note that these six methods do not come near to exhausting the possible techniques for avoiding local solutions. Methods which are hybrids of these and entirely new methods should be tested. In particular, we hope to develop an algorithm which allocates unconstrained minimizations to various regions similar to the way strategy S6 allocates the individual points x^k. Such a method would combine the rapid local optimizing power of the minimization method with a global analysis of the feasible region.

REFERENCES

[1] Bocharov, N. and A. A. Feldbaum, "An Automatic Optimizer for the Search of the Smallest of Several Minima," Automation and Remote Control, Vol. 23, No. 3 (1962).
[2] Brooks, S. H., "A Discussion of Random Methods for Seeking Maxima," Operations Research 6, 244 (1958).
[3] Fiacco, A. V. and G. P. McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques, John Wiley and Sons, Inc., New York (1968).
[4] Powell, M. J. D., "An Efficient Method for Finding the Minimum of a Function of Several Variables Without Calculating Derivatives," Computer Journal 7, 155 (1964).
[5] Zakharov, V. V., "A Random Search Method," Engineering Cybernetics 2, 26 (1969).

A NOTE ON MATHEMATICAL PROGRAMMING WITH FRACTIONAL OBJECTIVE FUNCTIONS

B. Mond
La Trobe University
Bundoora, Melbourne, Australia

and

B. D.
Craven
University of Melbourne
Melbourne, Australia

INTRODUCTION

Consider the fractional programming problem with linear constraints.

Problem 1 (P1):

(1) Maximize f(x)/g(x)

Subject to

(2) Ax <= b
(3) x >= 0.

It is assumed that the problem is regular, i.e., that the constraint set is nonempty and bounded and that f and g do not simultaneously become zero. There has been a great deal of interest in various special cases of P1. In particular, if f and g are linear, Charnes and Cooper [1] showed that optimal solutions can be determined from optimal solutions of two associated linear programming problems. Charnes and Cooper's result was extended to the ratio of two quadratic functions by Swarup [3]. He considered P1 with f and g quadratic, and showed that an optimal solution, if it exists, can be obtained from the solutions of two associated quadratic programming problems, each with linear constraints and one quadratic constraint. Sharma [2] considered P1 with f and g polynomials. He showed that an optimal solution, if it exists, can be obtained from the solutions of two associated programming problems where the objective function is a polynomial and the constraints are all linear except for one polynomial constraint. Here we consider a much wider class of functionals f and g, and obtain a theorem that includes as special cases the corresponding results of [1], [2], and [3].

NOTATION AND DEFINITIONS

A in R^(m x n), x in R^n, b in R^m, t in R; f and g are mappings from R^n into R. phi denotes a monotone strictly increasing function from R into R, with phi(t) > 0 for t > 0.

For a specified function phi, define the functions F and G, for real positive t and y in R^n, by

(4) F(y, t) = f(y/t) phi(t) and G(y, t) = g(y/t) phi(t).

Define

(5) F(y, 0) = lim (t -> 0) F(y, t) and G(y, 0) = lim (t -> 0) G(y, t)

whenever these limits exist. Assume that G(0, 0) = 0 whenever it exists.
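The point of definitions (4) is that under the substitution y = tx the factor phi(t) cancels from the ratio, so F(y, t)/G(y, t) = f(x)/g(x). A quick numerical check of this identity, with an illustrative quadratic pair f, g and phi(t) = t^2 (all choices here are for illustration only):

```python
# Numerical check that F(y, t)/G(y, t) = f(x)/g(x) under y = t*x,
# using definitions (4).  The functions f, g and phi are illustrative.

def f(x):
    return x[0] ** 2 + 2.0 * x[1] + 3.0        # illustrative numerator

def g(x):
    return x[0] + x[1] ** 2 + 1.0              # illustrative denominator

def phi(t):
    return t ** 2                              # monotone, phi(t) > 0 for t > 0

def F(y, t):
    return f([yi / t for yi in y]) * phi(t)    # definition (4)

def G(y, t):
    return g([yi / t for yi in y]) * phi(t)    # definition (4)

x = [2.0, 0.5]
for t in (0.25, 1.0, 3.0):
    y = [t * xi for xi in x]                   # the substitution y = t*x
    assert abs(F(y, t) / G(y, t) - f(x) / g(x)) < 1e-9
```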
RESULTS

Let us introduce the transformation y = tx, where for a specified function phi and nonzero constant Delta in R, we require

(6) G(y, t) = Delta.

On multiplying numerator and denominator of (1) by phi(t), and using (4) and (6), we obtain the associated nonlinear programming problem.

Problem 2 (P2):

Maximize F(y, t)

Subject to

(7) Ay - bt <= 0
(8) G(y, t) = Delta
(9) y >= 0
(10) t >= 0.

LEMMA: If the point (y, t) satisfies the constraints of P2, then t > 0.

REMARK: This only requires proof if G(y, 0) is defined, by (5). This is automatically the case if g is affine and phi(t) = t. By (8), the point (0, 0) is not feasible for P2.

PROOF: Assume that the point (y, 0) is feasible for P2; then y != 0. Let x be feasible for P1. Since the constraints are linear, x + ky is feasible for P1 for any positive k, contradicting the boundedness of the constraint set of P1.

THEOREM 1: If (i) 0 < sgn Delta = sgn g(x*) for an optimal solution x* of P1, and (ii) (y*, t*) is an optimal solution of P2, then y*/t* is an optimal solution of P1.

PROOF: Assume that the theorem is false, i.e., assume that there exists an optimal x* such that

(11) f(x*)/g(x*) > f(y*/t*)/g(y*/t*).

By condition (i), g(x*) = theta * Delta for some theta > 0. Consider t = phi^(-1)(1/theta) and y = tx*. Then phi(t) g(x*) = G(y, t) = Delta, so (y, t) satisfies (8), (9), and (10), and also (7) since the constraint is linear. Thus (y, t) is feasible for P2. Now

(12) f(x*)/g(x*) = phi(t) f(x*) / [phi(t) g(x*)] = F(y, t)/Delta.

Also

(13) f(y*/t*)/g(y*/t*) = phi(t*) f(y*/t*) / [phi(t*) g(y*/t*)] = F(y*, t*)/G(y*, t*) = F(y*, t*)/Delta.

Hence, for the feasible (y, t), (11), (12), and (13) show that F(y, t) > F(y*, t*), contradicting the assumption that (y*, t*) is optimal for P2.

If sgn g(x*) < 0 for x* an optimal solution of P1, then replacing f by -f and g by -g, the functional is unaltered and for the new denominator we have -g(x*) > 0. Thus, if P1 has a solution, it can be obtained by solving two associated nonlinear programming problems.
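In the linear special case with phi(t) = t, problem P2 is itself a linear program, which is how the Charnes-Cooper result [1] is recovered below. A small illustrative instance, with all problem data invented for the example and the tiny LP solved by brute-force enumeration of basic points rather than by any particular LP code:

```python
from itertools import combinations

# Illustrative instance: maximize (2x1 + x2 + 1)/(x1 + 3x2 + 2)
# subject to x1 + x2 <= 4, x >= 0.  With phi(t) = t and y = t*x, P2 is
#   maximize 2y1 + y2 + t  s.t.  y1 + y2 - 4t <= 0,
#                                y1 + 3y2 + 2t  = 1,  y, t >= 0.
# Each vertex of this 3-variable LP makes 3 constraints active.

rows = [([1, 1, -4], 0.0),    # y1 + y2 - 4t <= 0
        ([1, 0, 0], 0.0),     # y1 >= 0
        ([0, 1, 0], 0.0),     # y2 >= 0
        ([0, 0, 1], 0.0)]     # t  >= 0
eq = ([1, 3, 2], 1.0)         # y1 + 3y2 + 2t = 1 (always active)

def det3(m):
    (a, b, c), (d, e, f_), (g, h, i) = m
    return a*(e*i - f_*h) - b*(d*i - f_*g) + c*(d*h - e*g)

def solve3(r1, r2, r3):
    """Cramer's rule for a 3x3 system of active constraints."""
    mat = [r1[0][:], r2[0][:], r3[0][:]]
    rhs = [r1[1], r2[1], r3[1]]
    d = det3(mat)
    if abs(d) < 1e-12:
        return None                       # singular: no vertex here
    sol = []
    for k in range(3):
        mk = [row[:] for row in mat]
        for row, v in zip(mk, rhs):
            row[k] = v                    # replace column k with rhs
        sol.append(det3(mk) / d)
    return sol

best = None
for r1, r2 in combinations(rows, 2):
    z = solve3(eq, r1, r2)
    if z is None:
        continue
    y1, y2, t = z
    if y1 + y2 - 4*t <= 1e-9 and min(y1, y2, t) >= -1e-9:
        val = 2*y1 + y2 + t               # the P2 objective F(y, t)
        if best is None or val > best[0]:
            best = (val, z)

val, (y1, y2, t) = best
x = (y1 / t, y2 / t)                      # recover the P1 solution y*/t*
```

Here the optimum is val = 3/2 at (y, t) = (2/3, 0, 1/6), recovering x = (4, 0), and one can check directly that the original ratio at x = (4, 0) is 9/6 = 3/2.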
Problem 3 (P3):

Maximize F(y, t)

Subject to

(14) Ay - bt <= 0
     G(y, t) = 1
     y >= 0, t >= 0

and Problem 4 (P4):

Maximize -F(y, t)

Subject to

(15) Ay - bt <= 0
     -G(y, t) = 1
     y >= 0, t >= 0.

SPECIAL CASES

If f and g are linear and phi(t) = t, then our theorem gives the result of Charnes and Cooper [1]. If f and g are quadratic functions and phi(t) = t^2, we obtain the result of Swarup [3]. If f and g are polynomials of degree m and phi(t) = t^m, we obtain the result of Sharma [2].

If f and g are homogeneous of degree k, and phi(t) = t^k, then F(y, t) = f(y), G(y, t) = g(y), and P2 takes a simple form. An example is f(x) = d'x + (x'Cx)^(1/2), where C is a positive semidefinite matrix.

REMARKS

As noted in [2] and [3], even if G(y, t) is a convex function of y and t, the constraint set of P3 is not necessarily convex. Instead of P3, therefore, it is sometimes more convenient to deal with the following

Problem 3' (P3'):

Maximize F(y, t)

Subject to

Ay - bt <= 0
G(y, t) <= 1
y >= 0, t >= 0.

If G(y, t) is a convex function of the vector variable (y, t), then the constraint set of P3' is convex. It should be noted that even if g(x) is convex with respect to x, G(y, t) need not be convex with respect to the vector variable (y, t). As an example, consider g(x) = x'Cx - k, where C is a positive semidefinite symmetric matrix and k is a positive scalar; thus g(x) is convex with respect to x. Taking phi(t) = t^2, G(y, t) = y'Cy - kt^2, which is not convex.

If (y*, t*) is optimal for P3', t* > 0, and G(y*, t*) = 0, then max P1 may be infinite, since x* = y*/t* is feasible for P1 and g(x*) = 0. If G(y*, t*) = Delta_1, where 0 < Delta_1 < infinity, then (y*, t*) is also optimal for P2 with Delta = Delta_1, so Theorem 1 applies. However, the optimum of P3' can occur at (y, t) = (0, 0), which does not correspond to an optimum of P1. For example, if P1 is the program (for a real variable x)

P1: Maximize f(x) = (-x - 3)/(x + 1) subject to x <= 2 and x >= 0,

then taking phi(t) = t, the corresponding P3' is

P3': Maximize -y - 3t subject to y + t <= 1, y - 2t <= 0, y >= 0, t >= 0.
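The degenerate behavior of this example can be checked numerically. A brute-force scan of P1's feasible interval shows its maximum, while the P3' objective at the origin is larger; the grid resolution is an arbitrary choice.

```python
# P1 maximizes f(x) = (-x - 3)/(x + 1) over 0 <= x <= 2, while P3'
# maximizes -y - 3t over y + t <= 1, y - 2t <= 0, y >= 0, t >= 0.

def ratio(x):
    return (-x - 3.0) / (x + 1.0)

# brute-force scan of the P1 feasible interval [0, 2]; the ratio is
# strictly increasing here, so the maximum sits at the right endpoint
xs = [2.0 * i / 10000 for i in range(10001)]
x_star = max(xs, key=ratio)        # P1 optimum: x = 2, value -5/3
p1_max = ratio(x_star)

# the origin satisfies every P3' constraint, and its objective value 0
# exceeds p1_max, so the P3' optimum does not recover the P1 solution
p3_at_origin = 0.0
```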
The maximum for P3' then occurs at (y, t) = (0, 0), but the maximum for P1 occurs at x = 2.

Similarly, instead of P4, it might be more convenient to consider

Problem 4' (P4'):

Maximize -F(y, t)

Subject to

Ay - bt <= 0
-G(y, t) <= 1
y >= 0, t >= 0.

If G is concave in the vector variable (y, t), the constraint set of P4' is convex.

REFERENCES

[1] Charnes, A. and W. W. Cooper, "Programming With Linear Fractional Functionals," Nav. Res. Log. Quart. 9, 181-186 (1962).
[2] Sharma, I. C., "Feasible Direction Approach to Fractional Programming Problems," Opsearch 4, 61-72 (1967).
[3] Swarup, K., "Programming with Quadratic Fractional Functionals," Opsearch 2, 23-30 (1965).

NEWS AND MEMORANDA

Mathematical Models of Target Coverage and Missile Allocation

The Military Operations Research Society announces that it now has copies of its first monograph, "Mathematical Models of Target Coverage and Missile Allocation" by A. Ross Eckler and Stefan A. Burr, available for sale. The book may be purchased for $7.50 postpaid by contacting: MORS, 101 South Whiting St., Alexandria, Va. 22304.

The monograph presents a comprehensive summary of analytical models primarily used for problems in strategic defense but applicable to a wide variety of more generalized resource allocation problems. The topics discussed include models of defended point targets, circular targets, gaussian targets, generalized area targets, groups of identical targets, and nonidentical targets. Offense and defense strategies are examined under alternative assumptions about the information available to both sides. An extensive bibliography is included.
INFORMATION FOR CONTRIBUTORS

The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of scientific information in logistics and will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.

Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217. Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one or more referees.

Manuscripts submitted for publication should be typewritten, double-spaced, and the author should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted with the original. A short abstract (not over 400 words) should accompany each manuscript. This will appear at the head of the published paper in the QUARTERLY.

There is no authorization for compensation to authors for papers which have been accepted for publication. Authors will receive 250 reprints of their published papers.

Readers are invited to submit to the Managing Editor items of general interest in the field of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections of the QUARTERLY.

NAVAL RESEARCH LOGISTICS QUARTERLY
SEPTEMBER 1973, VOL. 20, NO.
3    NAVSO P-1278

CONTENTS

ARTICLES

Sequential Determination of Inspection Epochs for Reliability Systems with General Lifetime Distributions .... S. Zacks, W. J. Fenske
An Empirical Bayes Estimator for the Scale Parameter of the Two-Parameter Weibull Distribution .... G. K. Bennett
Optimal Allocation of Unreliable Components for Maximizing Expected Profit Over Time .... C. G. Henin
A Continuous Submarine Versus Submarine Game .... E. Langford
Total Optimality of Incrementally Optimal Allocations .... L. D. Stone
An Approach to the Allocation of Common Costs of Multi-Mission Systems .... R. T. Crow
An Explicit General Solution in Linear Fractional Programming .... A. Charnes, W. W. Cooper
Using Decomposition in Integer Programming .... L. Schrage
Numerical Treatment of a Class of Semi-Infinite Programming Problems .... S. A. Gustafson, K. O. Kortanek
Min/Max Bounds for Dynamic Network Flows .... W. L. Wilkinson
Production-Allocation Scheduling and Capacity Expansion Using Network Flows Under Uncertainty .... J. Prawda
Concave Minimization over a Convex Polyhedron .... H. A. Taha
Estimation of a Hidden Service Distribution of an M/G/infinity System .... L. L. George, A. C. Agrawal
The Single Server Queue in Discrete Time: Numerical Analysis III .... M. F. Neuts, E. Klimko
Some Experiments in Global Optimization .... J. K. Hartman
A Note on Mathematical Programming with Fractional Objective Functions .... B. Mond, B. D. Craven

News and Memoranda

OFFICE OF NAVAL RESEARCH
Arlington, Va. 22217